Recognition of images
D.L. Sholomov, A.G. Volkov, D.V. Polevoy. Document identification in terms of linear programming

Abstract.

The paper presents a method for describing document templates by rules on the relative location of primitive elements. Such a description reduces the identification of a weakly structured document to an integer linear programming problem: the objective function to be maximized measures how well the document matches the template, and the relative-location rules are transformed into a set of linear inequalities.
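
The reduction described above can be illustrated on a toy example. The sketch below is not the authors' formulation: the element names, coordinates, match scores, rule encoding and the function `solve_template_ilp` are all hypothetical, and an exhaustive search over 0/1 vectors stands in for a real integer programming solver. A binary variable x[i][j] equals 1 when template element i is assigned to detected primitive j; the objective is the total match score, and each rule "a lies at least `gap` before b along an axis" becomes one linear inequality over these variables.

```python
from itertools import product

# Hypothetical template elements and detected primitives (illustrative values).
elements = ["number_label", "date_label", "total_label"]
# (x, y) of each detected primitive; y grows downward.
candidates = [(40, 10), (300, 12), (45, 200), (310, 400)]
# score[i][j]: assumed similarity of primitive j to template element i.
score = [
    [0.90, 0.10, 0.85, 0.00],
    [0.20, 0.80, 0.10, 0.10],
    [0.97, 0.00, 0.50, 0.90],
]
# Rule (a, b, axis, gap): coordinate(a) + gap <= coordinate(b).
rules = [
    (0, 2, 1, 50),   # number_label at least 50 px above total_label
    (0, 1, 0, 100),  # number_label at least 100 px left of date_label
]


def solve_template_ilp(score, coords, rules):
    """Maximize the total match score subject to linear constraints.

    Brute-force enumeration of all 0/1 vectors; only viable for tiny inputs.
    Returns (best_value, {element_index: primitive_index}) or (None, None).
    """
    n, m = len(score), len(coords)

    def var(i, j):
        return i * m + j  # position of x[i][j] in the flattened vector

    objective = [score[i][j] for i in range(n) for j in range(m)]
    constraints = []  # each entry: (coefficients, sense, right-hand side)

    # Every template element is assigned to exactly one primitive.
    for i in range(n):
        row = [0] * (n * m)
        for j in range(m):
            row[var(i, j)] = 1
        constraints.append((row, "==", 1))

    # Every primitive is used by at most one template element.
    for j in range(m):
        row = [0] * (n * m)
        for i in range(n):
            row[var(i, j)] = 1
        constraints.append((row, "<=", 1))

    # Relative-location rules: sum_j c_j*x[a][j] - sum_j c_j*x[b][j] <= -gap.
    for a, b, axis, gap in rules:
        row = [0] * (n * m)
        for j in range(m):
            row[var(a, j)] += coords[j][axis]
            row[var(b, j)] -= coords[j][axis]
        constraints.append((row, "<=", -gap))

    best_value, best_x = None, None
    for x in product((0, 1), repeat=n * m):
        feasible = True
        for row, sense, rhs in constraints:
            lhs = sum(r * v for r, v in zip(row, x))
            if (sense == "==" and lhs != rhs) or (sense == "<=" and lhs > rhs):
                feasible = False
                break
        if not feasible:
            continue
        value = sum(c * v for c, v in zip(objective, x))
        if best_value is None or value > best_value:
            best_value, best_x = value, x

    if best_x is None:
        return None, None
    assignment = {i: j for i in range(n) for j in range(m) if best_x[var(i, j)]}
    return best_value, assignment


value, assignment = solve_template_ilp(score, candidates, rules)
for i, j in sorted(assignment.items()):
    print(elements[i], "->", candidates[j])
```

With these illustrative numbers, the highest-scoring assignment without rules would place total_label above number_label; the two inequalities exclude it, so the search returns the best assignment that respects the layout. In practice the exhaustive loop would be replaced by a dedicated integer linear programming solver.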

Keywords:

document recognition, template description, template matching, flexible forms, document identification, linear programming, mass document input, graphical primitives, text recognition, invoice recognition.

PP. 74-80.


