Data mining and image recognition
A.E. Zhukovsky Methods for interframe integration of document detection results in a video stream of a mobile device

Abstract.

The paper addresses the task of detecting the position of a document in a video stream received from a mobile device. Particular attention is paid to methods for integrating the document positions obtained over a sequence of frames. The paper describes an algorithm based on the Kalman filter that selects document positions from a set of provided alternatives, then integrates and refines them across the video stream. The performance of the algorithm is analyzed on the dataset of the ICDAR’15 competition on document detection from a smartphone.

Keywords:

document detection, video stream, integration, projective transformation, mobile cameras, Kalman filter.

PP. 15-22. 

DOI: 10.14357/20790279180502

References

1. V.V. Arlazarov, A.E. Zhukovsky, V.E. Krivtsov, D.P. Nikolaev, D.V. Polevoy. Analiz osobennostey ispol’zovaniya statsionarnykh i mobil’nykh malorazmernykh tsifrovykh video kamer dlya raspoznavaniya dokumentov, [Analysis of specific character of usage fixed and mobile smallsize video cameras for document recognition], Informatsionnyye tekhnologii i vychislitel’nyye sistemy [Information Technologies and computing systems], Vol. 3, 2014, pp. 71-81
2. J. Liang, D. Doermann, H. Li. Camera-based analysis of text and documents: a survey, Int. J. of Document Analysis and Recognition, vol. 7, Issue 2, 2005, pp. 84-104
3. D. Doermann, J. Liang, H. Li. Progress in Camera-Based Document Image Analysis, IEEE Proc. 7th Int. Conf. on Document Analysis and Recognition, Vol.1, 2003, pp. 606-616
4. K. Bulatov, V.V. Arlazarov, T. Chernov, O. Slavin and D. Nikolaev. Smart IDReader: Document Recognition in Video Stream, 2017 14th IAPR Int. Conf. on Document Analysis and Recognition (ICDAR), 2017, pp. 39-44. doi: 10.1109/ICDAR.2017.347
5. Burie JC., Chazalon J., Coustaty M., Eskenazi S., Luqman M.M., Mehri M., Nayef N., Ogier JM., Prum S., Rusinol M. ICDAR2015 Competition on Smartphone Document Capture and OCR (SmartDoc), 13th Int. Conf. on Document Analysis and Recognition. 2015
6. V.V. Arlazarov, A.E. Zhukovsky, V.E. Krivtsov, V.V. Postnikov. Ispol’zovaniye grafa peresecheniy v zadache obnaruzheniya dokumenta na izobrazhenii, poluchennom so smartfona [Usage of the intersection graph in the task of camera-based document detection], Iskusstvennyy Intellekt i Prinyatiye Resheniy [Artificial Intelligence and Decision Making], vol. 2, pp. 60-69, 2016
7. A. Zhukovsky et al. “Segments Graph-Based Approach for Document Capture in a Smartphone Video Stream,” 2017 14th IAPR Int. Conf. on Document Analysis and Recognition (ICDAR), 2017, pp. 337-342, doi: 10.1109/ICDAR.2017.63
8. Skoryukina N., Shemyakina Y., Arlazarov V.L., Faradjev I. Document localization algorithms based on feature points and straight lines, Proc. SPIE 10696, 10th Int. Conf. on Machine Vision (ICMV 2017), pp. 1-8, 2018, DOI: 10.1117/12.2311478
9. T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein. Introduction to Algorithms (second ed.). MIT Press and McGraw-Hill. ISBN 978-0-262-53196-2., 2001
10. R.E. Kalman. A New Approach to Linear Filtering and Prediction Problems, J. of Basic Engineering 82, 35, 1960.
11. R. Hartley, A. Zisserman. Multiple view geometry in computer vision, Cambridge University Press, New York, 2003
12. Y.A. Shemyakina, A.E. Zhukovsky, I.A. Faradjev. Issledovaniye algoritmov vychisleniya proyektivnogo preobrazovaniya v zadache navedeniya na planarnyy ob”yekt po osobym tochkam [Investigation of algorithms for calculating a projective transformation in the problem of targeting to a planar object from feature points], Iskusstvennyy Intellekt i Prinyatiye Resheniy [Artificial Intelligence and Decision Making], vol. 1, 2017, pp. 43-49
13. Y. Shemyakina, A. Zhukovsky, I. Faradjev. The Calculation of a Projective Transformation in the Problem of Planar Object Targeting by Feature Points, Proc. SPIE 10341, ICMV 2016, 10341 ed., 9th Int. Conf. on Machine Vision, 2017, vol. 10341, pp. 1-6, 2017, DOI: 10.1117/12.2268590
14. H. Bay, T. Tuytelaars, L. V. Gool. Surf: Speeded up robust features, European Conf. on Computer Vision (ECCV), 2006, pp. 404-417
15. M. Calonder, V. Lepetit, C. Strecha, P. Fua. BRIEF: Binary Robust Independent Elementary Features, 11th European Conf. on Computer Vision (ECCV), 2010
16. Fischler M.A., Bolles R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, 24(6), 1981, pp. 381-395
17. M. Everingham, L.V. Gool, C. Williams, J. Winn, and A. Zisserman. “The PASCAL visual object classes (VOC) challenge” IJCV, vol. 88, no. 2, 2010, pp. 303–338