Journal "Information Technologies and Computing Systems" (Информационные технологии и вычислительные системы) - O. A. Slavin, "Field Matching Algorithms for the Recognition of Conditionally Rigid Business Documents"

Definitions of flexible and rigid documents, as used in technologies for entering business documents into a computer, are proposed. The specifics of creating, digitizing, and analyzing rigid forms and rigid documents are considered. The limits of applicability of the model for matching images of rigid documents distorted during digitization are described. A model for matching flexible documents is considered; it is based on recognized words and graphic primitives linked by a set of order relations. The classification is based on the different ways business documents are prepared for printing. The specifics of field matching and recognition are described for several document types: conditionally rigid documents, flexible documents produced from a single form, and flexible documents produced from a small or a large number of forms. The case of recognizing conditionally rigid documents using flexible-document input technologies is considered. The experiments show that, for some mark fields under heavy noise and significant distortion, the error rate is halved.

EDN BXZTGE
