Computer analysis of texts
A. V. Gayer "Context-independent fast text detection method for recognizing phone numbers"
Abstract. 

Modern text detection methods rely on computationally expensive deep learning models and require large amounts of training data, including real data. For text detection in arbitrary scenes, collecting and annotating real training data is extremely labor-intensive and expensive because of the high variability of possible scenes. This paper presents a new method for detecting text in arbitrary images that needs no photographs of text in real scenes for training and can instead be trained on simple synthetic data in the form of text strings. The proposed neural network model is 42 times smaller (84 KB versus 3.6 MB) than the text detector of PaddleOCR, one of the best text recognition systems in terms of quality and speed, which makes it well suited to mobile devices. The model was tested as part of a phone number recognition system, where it achieved 80.35% correctly recognized numbers.
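The two figures quoted in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the size comparison uses decimal units (1 MB = 1000 KB), and a number counts as correctly recognized only when the predicted string matches the reference exactly. The function names and the sample phone numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def size_ratio(baseline_bytes: float, model_bytes: float) -> float:
    """Return how many times smaller the model is than the baseline."""
    if model_bytes <= 0:
        raise ValueError("model size must be positive")
    return baseline_bytes / model_bytes


def exact_match_rate(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Share of items whose predicted string equals the reference exactly.

    Assumption: "correctly recognized" means a full-string match.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Assumed decimal units; binary units (1 MB = 1024 KB) give a ratio near 44.
KB = 1000
MB = 1000 * KB
ratio = size_ratio(3.6 * MB, 84 * KB)
print(f"size ratio: {ratio:.1f}")  # about 42.9 with decimal units

# Hypothetical recognition results, for illustration only.
predicted = ["+1 555 0100", "+1 555 0101", "+1 555 0199"]
expected = ["+1 555 0100", "+1 555 0101", "+1 555 0102"]
rate = exact_match_rate(predicted, expected)
print(f"correctly recognized: {rate:.2%}")
```

Under the decimal-unit assumption the ratio comes out just under 43, which agrees with the "42 times smaller" figure if that figure was truncated rather than rounded.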

Keywords: deep learning, object detection, image segmentation, text detection.


DOI: 10.14357/20790279240305 

EDN: HREWAU

PP. 39-47.


