IMAGE PROCESSING METHODS
PATTERN RECOGNITION
M. G. Lobanov, D. L. Sholomov On the Acceleration of the Convolutional Neural Network Architecture Based on Resnet in the Task of Road Scene Objects Recognition

Abstract.

Recent approaches to road scene object detection based on convolutional neural networks have reached a level of quality acceptable for use in autonomous vehicle control and advanced driver-assistance systems (ADAS). However, the best modern network architectures are computationally heavy and cannot be integrated into real-time systems. The most pressing problem is therefore to accelerate these networks and to find the optimal balance between detection quality and speed. This paper proposes a method for lightening the architecture of the Deformable Convolutional Network with a ResNet backbone that yields a threefold increase in inference speed, while the quality of road scene object detection decreases only slightly. In addition, the paper compares the quality of a network of this architecture trained on different open datasets: BDD and MS-COCO.

Keywords:

object detection, road scene objects, deformable convolutional network, ResNet, ADAS, convolutional network acceleration, BDD, MS-COCO, pedestrian detection, vehicle detection.

Pp. 57-65.

DOI 10.14357/20718632190305
