Abstract.
This paper considers the problem of extracting filled-in elements (fields) from a recognized document image using descriptors, that is, descriptions of one or more structural elements. Structural elements can be words of static text or ruling lines that shape the document layout. Business documents with a simplified structure and a limited vocabulary are considered, including flexible documents whose page design can vary significantly. Descriptors are built to tolerate a significant number of possible page recognition errors. Combined descriptors consisting of several terms and line segments are described, and a descriptor-based binding algorithm is given. Experiments show that extracting combined descriptors improves the accuracy of document field recognition by 17% and the accuracy of information extraction from the document image by 16%. The Smart Document Engine SDK was used for OCR in the experiments.
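As a rough illustration of binding a descriptor made of several terms to noisy OCR output, the sketch below matches each term against the recognized words with a small edit-distance tolerance and accepts the descriptor only if every term is found. It is not the algorithm from the paper, which the abstract does not specify. The names `Word`, `find_term`, `bind_combined` and the `max_errors` threshold are hypothetical, line segments are left out, and the sample words are invented.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: int  # left coordinate of the recognized word, in pixels (assumed layout)
    y: int  # top coordinate of the recognized word, in pixels (assumed layout)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def find_term(term, words, max_errors=1):
    """Return the recognized word closest to `term`, or None if none is within max_errors."""
    def dist(w):
        return edit_distance(term.lower(), w.text.lower())

    best = min(words, key=dist, default=None)
    if best is None or dist(best) > max_errors:
        return None
    return best


def bind_combined(terms, words, max_errors=1):
    """Bind a multi-term descriptor: succeed only if every term is found."""
    hits = [find_term(t, words, max_errors) for t in terms]
    return None if any(h is None for h in hits) else hits


# Invented OCR output containing typical character confusions (I/l, m/n).
ocr_words = [Word("lnvoice", 40, 30), Word("Nunber", 120, 30), Word("Total", 40, 400)]
anchor = bind_combined(["Invoice", "Number"], ocr_words)
missing = bind_combined(["Invoice", "Date"], ocr_words)
```

Here `anchor` holds the two misrecognized words that still match the descriptor, while `missing` is `None` because no word is close enough to "Date". In a fuller version, the coordinates of the bound words would serve as reference points for locating the field region.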
Keywords:
virtual reality, augmented reality, virtual reality helmet, immersiveness, virtual object, haptic technologies, content.
pp. 13-24.
DOI 10.14357/20718632220402