Abstract. The paper addresses the task of detecting the position of a document in a video stream received from a mobile device. Particular attention is paid to methods for integrating the document positions obtained over a sequence of frames. The paper describes an algorithm based on the Kalman filter that selects the document position from a set of provided alternatives and then integrates and refines the positions across the video stream. The performance of the algorithm is analyzed on the dataset provided in the ICDAR’15 competition on detection of documents from a smartphone.
Keywords: document detection, video stream, integration, projective transformation, mobile cameras, Kalman filter.
PP. 15-22.
DOI: 10.14357/20790279180502
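The abstract names a Kalman-filter-based selection and refinement step but does not give the state model, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes the document position is a quadrilateral of four (x, y) corners, that each coordinate is filtered independently with a random-walk motion model, and that the alternative closest to the prediction is the one selected. The names `ScalarKalman`, `QuadTracker`, and all variance values are hypothetical.

```python
class ScalarKalman:
    """1-D Kalman filter with a random-walk (constant-position) model."""

    def __init__(self, value, variance=100.0, process_var=1.0, meas_var=4.0):
        self.x = float(value)  # state estimate (one corner coordinate)
        self.p = variance      # variance of the estimate
        self.q = process_var   # assumed process noise variance
        self.r = meas_var      # assumed measurement noise variance

    def predict(self):
        # Random walk: the estimate stays put, the uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z):
        # Blend prediction and measurement according to the Kalman gain.
        gain = self.p / (self.p + self.r)
        self.x += gain * (z - self.x)
        self.p *= 1.0 - gain
        return self.x


def _as_quad(flat):
    """Regroup 8 flat coordinates into 4 (x, y) corners."""
    return [(flat[i], flat[i + 1]) for i in range(0, 8, 2)]


class QuadTracker:
    """Selects one alternative per frame and refines it over time."""

    def __init__(self, first_quad):
        self.filters = [ScalarKalman(c) for pt in first_quad for c in pt]

    def step(self, candidates):
        predicted = [f.predict() for f in self.filters]
        if not candidates:
            # No detection in this frame: fall back to the prediction.
            return _as_quad(predicted)
        flat = [[c for pt in quad for c in pt] for quad in candidates]
        # Select the alternative nearest to the predicted corners.
        best = min(
            flat,
            key=lambda q: sum((a - b) ** 2 for a, b in zip(q, predicted)),
        )
        return _as_quad([f.update(z) for f, z in zip(self.filters, best)])
```

Calling `step` once per frame with that frame's list of alternative quadrilaterals returns the refined corners. A fuller treatment would likely model corner velocity and account for the projective transformation between frames, which this sketch deliberately leaves out.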