Journal "Proceedings of the Institute for Systems Analysis of the Russian Academy of Sciences" - D.A. Ilin Fast words boundaries localization in text fields for low quality document images

Viewing issue 2018-S1

Data mining and image recognition

N.S. Skoryukina, A.N. Milovzorov, D.V. Polevoy, V.V. Arlazarov Paintings recognition in uncontrolled conditions using one-shot learning

A.E. Zhukovsky Methods for interframe integration of document detection results in a video stream of a mobile device

I.A. Kunina, E.I. Panfilova, M.A. Povolotskiy Zebra-crossing detection on road images using dynamic time warping

O.A. Slavin, V.L. Arlazarov Method for classifying recognized pages of administrative documents on the basis of text key points

O.O. Petrova, K.B. Bulatov Methods of machine-readable zone recognition results post-processing

A.E. Marchenko, E.I. Ershov, D.A. Shepelev, D.S. Sidorchuk, V.P. Bozhkova, D.P. Nikolaev Designing of language of description of observable properties of recognized objects in the absence of samples

Intellectual systems and technologies

E.E. Limonova, N.L. Rzhenev, A.V. Uskov, M.I. Neiman-zade Fast implementation of Hamming distance on VLIW-architectures on the example of Elbrus platform

V.V. Arlazarov, K.B. Bulatov, A.V. Uskov A model of object recognition system in video stream of a mobile device

A.A. Ivanova, S.A. Gladilin, A.E. Zhukovsky, E.L. Pliskin Database for the administrative accounting of scientific publications

A.S. Ingacheva, A.V. Sheshkus, T.S. Chernov, E.E. Limonova, V.V. Arlazarov X-ray computed tomography scanner – a new tool in recognition

N.O. Beshaposhnikov, A.G. Kushnirenko, A.A. Levin A method for auto-calibration of the educational robot control parameters using computer vision library OpenCV

Image and signal processing

A.E. Zhukovsky, E.E. Limonova, D.P. Nikolaev Exact implementation of common image processing algorithms using fully convolutional networks

V.E. Prun Reducing the influence of high-absorbing inclusions on CT reconstructions using algebraic reconstruction technique

B.I. Savelyev, I.B. Mamay, D.P. Nikolaev, V.L. Arlazarov, K.B. Bulatov, N.S. Skoryukina A method of projective transformations graph adjustment for panorama stitching problem for images of planar objects

D.V. Tropin, D.P. Nikolaev, D.G. Slugin The method of image alignment based on sharpness maximization

J.A. Shemiakina, A.E. Zhukovsky, I.A. Konovalenko, D.P. Nikolaev Algorithm for automatic framing of digital images under projective transformation

Machine learning

A.V. Gayer, A.V. Sheshkus, Y.S. Chernyshova Augmentation on the fly for the neural networks learning

V.V. Arlazarov, D.P. Matalov, S.A. Usilin Localization of the seal on the identity document image using machine learning approach

A.E. Lynchenko, A.V. Sheshkus, V.L. Arlazarov Identity document classification algorithm based on similarity metric robust to projective distortions

V.A. Malykh, V.A. Lyalin On Classification of Noisy Texts

Y.S. Chernyshova, M.A. Aliev, A.V. Sheshkus Optical font recognition of images captured with mobile devices and its application for detecting identity documents forgery

D.A. Ilin Fast words boundaries localization in text fields for low quality document images

D.E. Ivanov, D.V. Polevoy, D.L. Sholomov Selection of informative elements for the training of a lightweight convolutional neural network classifier in the conditions of a strong imbalance of the training sample


	D.A. Ilin Fast words boundaries localization in text fields for low quality document images
Abstract. The paper examines the problem of precisely localizing word boundaries in document text zones. Document processing on a mobile device consists of document localization, perspective correction, localization of individual fields, word finding within separate zones, segmentation, and recognition. When an image is captured with a mobile digital camera under uncontrolled conditions, digital noise, perspective distortion, or glare may occur. At the same time, word boundary localization has to be solved at run time on a mobile CPU with limited computing capabilities and under the specified restrictions. The method presented in this paper solves a more specialized problem than finding text in natural images. It uses local features, a sliding window, and a lightweight neural network to achieve an optimal ratio of speed to precision. The algorithm takes 12 ms per field on the ARM processor of a mobile device. The boundary localization error rate on a test sample of 8000 fields is 0.3%.

Keywords: localization, image, document processing, computer vision.

PP. 192-198.

DOI: 10.14357/20790279180522
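The abstract names three ingredients (local features, a sliding window, and a lightweight neural network) without giving details. The following is only an illustrative sketch of how such pieces could fit together, not the method from the paper. It assumes a binarized field image, uses the per-column ink fraction as the local feature, and replaces the trained network with a single sigmoid neuron whose weights are hand-set and hypothetical. All function names and parameters are assumptions made for this example.

```python
from math import exp


def column_ink(field):
    """Local feature: fraction of ink pixels in each column (field holds 0/1 values)."""
    height = len(field)
    return [sum(row[x] for row in field) / height for x in range(len(field[0]))]


def gap_score(window, weights, bias):
    """One-neuron stand-in for the lightweight network: sigmoid(w . x + b)."""
    z = bias + sum(w * v for w, v in zip(weights, window))
    return 1.0 / (1.0 + exp(-z))


def word_boundaries(field, half_width=2, threshold=0.5):
    """Return (start, end) column ranges of words, with end exclusive."""
    ink = column_ink(field)
    n = len(ink)
    size = 2 * half_width + 1
    # Hypothetical hand-set weights (not trained): any ink inside the window
    # drives the score towards "not an inter-word gap".
    weights = [-20.0] * size
    bias = 4.0
    # Pad with blank columns so the window is defined at the field edges.
    padded = [0.0] * half_width + ink + [0.0] * half_width
    # Slide the window over every column and classify its centre.
    is_gap = [gap_score(padded[x:x + size], weights, bias) > threshold
              for x in range(n)]

    words, start = [], None
    for x in range(n + 1):
        inside = x < n and not is_gap[x]
        if inside and start is None:
            start = x
        elif not inside and start is not None:
            # Trim the segment to the columns that actually contain ink.
            inked = [c for c in range(start, x) if ink[c] > 0]
            if inked:
                words.append((inked[0], inked[-1] + 1))
            start = None
    return words


# Toy field: '#' is ink, '.' is background; three identical rows.
pattern = "..##.##.....###.."
field = [[1 if ch == "#" else 0 for ch in pattern] for _ in range(3)]
print(word_boundaries(field))  # [(2, 7), (12, 15)]
```

In this toy setup the one-column gap between the first two strokes is narrower than the window, so it is treated as spacing inside a word, while the five-column gap separates two words. A real system would learn the classifier from labeled fields and would need features that tolerate the noise and glare mentioned in the abstract.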
