Abstract.
This paper proposes definitions of flexible and rigid documents as used in technologies for entering administrative documents into a computer. It considers the features of creating, digitizing, and analyzing rigid forms and rigid documents, and describes the limits of applicability of the model for linking images of rigid documents distorted during digitization. A model for linking flexible documents is considered, based on recognized words and graphic primitives connected by a set of order relations. The document classification is based on the various methods of preparing administrative documents for printing. The features of field binding and recognition are described for several document types: conditionally rigid documents, flexible documents produced from a single form, and flexible documents produced from a small or a large number of forms. The case of recognizing conditionally rigid documents with flexible document input technologies is also considered. Experiments show that, for some fields of notes under strong noise and significant distortion, the proportion of errors is halved.
Keywords:
document recognition, conditionally rigid document, text feature point, check point.
PP. 37-48. DOI 10.14357/20718632230404
EDN BXZTGE