ANALYSIS OF METHODS FOR DISTANCE ESTIMATION TO AN OBJECT FROM A SINGLE VIDEO CAMERA IMAGE USING NEURAL NETWORKS

N. LUPENKO
R. BOHUSH
H. CHEN

Abstract

This paper discusses approaches to determining the distance to an object from an image produced by a monocular video camera, in which artificial neural networks are used at various stages of processing. A method based on computing a depth map, detecting the object, and then projecting its coordinates onto the depth map is analyzed. A method that uses the relationship between the real size of an object and its size in the image is described. A method based on a modification of YOLO, which extends the resulting descriptor with an additional vector characterizing the distance to the object, is considered. The data sets used to train the neural networks in algorithms for calculating the absolute distance to an object from an image are analyzed. The paper also discusses the effectiveness of the methods considered, their advantages and disadvantages, and the prospects for applying them in practice.
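
The geometric idea behind two of the approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of a focal length expressed in pixels, the assumption that the object's real height is known in advance, and the choice of the median to aggregate depth values inside a bounding box are all assumptions made here for illustration.

```python
from statistics import median


def distance_from_size(real_height_m, pixel_height, focal_length_px):
    """Size-ratio estimate under a pinhole camera model: Z = f * H / h.

    real_height_m   -- assumed known physical height of the object, metres
    pixel_height    -- height of the detected bounding box, pixels
    focal_length_px -- camera focal length expressed in pixels (assumed known)
    """
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_length_px * real_height_m / pixel_height


def distance_from_depth_map(depth_map, box):
    """Project a detection onto a depth map and aggregate the depths.

    depth_map -- 2D list of per-pixel depths in metres (rows, then columns),
                 e.g. the output of a monocular depth network (hypothetical)
    box       -- (x_min, y_min, x_max, y_max), max bounds exclusive
    The median is an illustrative choice that damps background pixels.
    """
    x_min, y_min, x_max, y_max = box
    values = [
        depth_map[y][x]
        for y in range(y_min, y_max)
        for x in range(x_min, x_max)
    ]
    if not values:
        raise ValueError("bounding box covers no pixels")
    return median(values)


if __name__ == "__main__":
    # A 1.7 m tall person spanning 170 px with a 1000 px focal length.
    print(distance_from_size(1.7, 170, 1000))

    # Toy 3x3 depth map; the box covers the whole first row.
    toy_depth = [
        [5.0, 5.0, 9.0],
        [5.0, 4.0, 9.0],
        [5.0, 4.0, 9.0],
    ]
    print(distance_from_depth_map(toy_depth, (0, 0, 3, 1)))
```

The size-ratio function is only as accurate as the assumed real height and the camera calibration. The depth-map function depends on the depth network producing metric rather than relative depth.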

How to Cite
LUPENKO, N., BOHUSH, R., & CHEN, H. (2024). ANALYSIS OF METHODS FOR DISTANCE ESTIMATION TO AN OBJECT FROM A SINGLE VIDEO CAMERA IMAGE USING NEURAL NETWORKS. Vestnik of Polotsk State University. Part C. Fundamental Sciences, (2), 24-33. https://doi.org/10.52928/2070-1624-2024-43-2-24-33
Author Biographies

R. BOHUSH, Euphrosyne Polotskaya State University of Polotsk

Doctor of Technical Sciences, Associate Professor

H. CHEN, Zhejiang Shuren University, China

Ph. D.
