MACHINE LEARNING
D.A. Ilin Fast words boundaries localization in text fields for low quality document images

Abstract.

The paper addresses the precise localization of word boundaries within document text fields. Document processing on a mobile device consists of document localization, perspective correction, localization of individual fields, word localization within each field, segmentation, and recognition. Images captured with a mobile digital camera under uncontrolled conditions may suffer from digital noise, perspective distortion, or glare. Nevertheless, word boundaries must be localized at run time on a mobile CPU with limited computing capabilities. The method presented in this paper solves a narrower problem than general text detection in natural images. It combines local features, a sliding window, and a lightweight neural network to achieve a favorable trade-off between speed and precision. The algorithm takes 12 ms per field on the ARM processor of a mobile device. The boundary localization error rate on a test sample of 8000 fields is 0.3%.
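The combination of local features, a sliding window, and a small classifier can be illustrated with a minimal sketch. This is not the network or the feature set used in the paper: the per-column ink feature, the single logistic unit with hand-set weights, and the names `column_ink`, `gap_score`, `word_boundaries`, and `min_gap` are all assumptions made for illustration. The sketch also assumes the field is already cropped and rectified, and is given as rows of grayscale values from 0 (ink) to 255 (background).

```python
from math import exp


def column_ink(field):
    """Local feature: fraction of ink in each column, from 0.0 to 1.0."""
    height = len(field)
    width = len(field[0])
    return [
        sum(255 - field[y][x] for y in range(height)) / (255.0 * height)
        for x in range(width)
    ]


def gap_score(window, weights, bias):
    """Hypothetical stand-in for a lightweight network: one logistic unit."""
    z = bias + sum(w * v for w, v in zip(weights, window))
    return 1.0 / (1.0 + exp(-z))


def word_boundaries(field, weights=(-1.0, -20.0, -1.0), bias=3.0, min_gap=3):
    """Return (start, end) column spans of words, with end exclusive.

    The weights are set by hand for this example, not trained.
    Gaps narrower than min_gap columns are treated as gaps between
    characters and do not split a word.
    """
    feats = column_ink(field)
    width = len(feats)
    half = len(weights) // 2

    # Slide a window over the columns and classify each centre column.
    is_gap = []
    for x in range(width):
        window = [
            feats[min(max(x + d, 0), width - 1)]  # clamp at the field edges
            for d in range(-half, half + 1)
        ]
        is_gap.append(gap_score(window, weights, bias) > 0.5)

    # Merge ink columns into words, splitting only on wide gaps.
    words = []
    start = None     # first column of the current word
    last_ink = None  # most recent ink column of the current word
    for x, gap in enumerate(is_gap):
        if gap:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            words.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        words.append((start, last_ink + 1))
    return words


if __name__ == "__main__":
    # Synthetic 4 x 20 field: two characters one column apart, then a third
    # character after a four-column gap.
    ink = set(range(2, 5)) | set(range(6, 9)) | set(range(13, 17))
    field = [[0 if x in ink else 255 for x in range(20)] for _ in range(4)]
    print(word_boundaries(field))  # [(2, 9), (13, 17)]
```

Because each column is scored from a fixed-size window, the cost grows linearly with the field width, which is the property that makes this family of methods attractive on a mobile CPU. A trained network would replace `gap_score`, and richer local features would replace `column_ink`.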

Keywords:

localization, image, document processing, computer vision.

PP. 192-198.

DOI: 10.14357/20790279180522
