YOLOv10s based human victim detection in cluttered urban environments with a crawler type rescue robot
R. Farkhetdinov ¹, B. Abbyasov ¹, A. Eryomin ¹, A. Dobrokvashina ¹, M. Svinin ², E. Magid ¹,³

¹ Institute of Information Technology and Intelligent Systems, Kazan Federal University, Kremlevskaya St. 35, Kazan, 420008, Russian Federation;
² Graduate School of Information Science and Engineering, Ritsumeikan University, 2-150 Iwakura-cho, Ibaraki, 5678570, Osaka, Japan;
³ School of Electronic Engineering, Tikhonov Moscow Institute of Electronics and Mathematics, HSE University, 34 Tallinskaya St., Moscow, 123592, Russian Federation

DOI: 10.18287/COJ1710

Article ID: 1710

Abstract:
Rescue robots are widely used in search and rescue operations to improve their efficiency. To reduce the operator's workload, a robot can perform some functions automatically, including victim detection. This paper introduces a Robot Operating System (ROS) based victim detection framework for the Servosila Engineer crawler rescue robot, which is equipped with four cameras. The victim detection algorithm uses video stream frames from a single camera and a trained YOLOv10s neural network that detects human body parts in images of a cluttered urban environment. To train the YOLOv10s model, a human body dataset of 15068 images was created by combining an existing dataset with a new dataset collected with the robot's camera. The model was trained to detect a person and three body parts: head, hand, and foot. The study analyzed how the distance between the robot and a human victim in a cluttered environment affects detection accuracy. The algorithm showed acceptable performance in validation experiments with three human participants under artificial lighting when the robot's camera was positioned 50 to 200 cm from a cluttered area. Within this range, an Average Precision (AP) of 0.75, 0.91, and 0.73 was achieved for the head, hand, and foot classes, respectively; the AP degraded rapidly with increasing distance. The experiments showed that hand class objects were detected more reliably than objects of the other classes across all three intervals. Unlike prior approaches that employed high-end hardware or multiple cameras, our system achieved reasonable accuracy using a single camera and low-power onboard computing.
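
The per-class AP values quoted above summarize a precision-recall curve. The abstract does not state which AP variant was used, so the following is only a minimal sketch of one common definition (all-point interpolation). The function name `average_precision`, its input layout, and the toy values in the usage note are illustrative assumptions, not the authors' evaluation code or data. Each detection is assumed to have already been matched against ground truth and labeled as a true or false positive.

```python
from typing import List, Tuple


def average_precision(scored_hits: List[Tuple[float, bool]],
                      num_ground_truth: int) -> float:
    """Area under the precision-recall curve for one class.

    scored_hits: one (confidence, is_true_positive) pair per detection.
    num_ground_truth: number of annotated objects of that class.
    Hypothetical helper; the matching rule (e.g. an IoU threshold) that
    decides is_true_positive is assumed to be applied beforehand.
    """
    if num_ground_truth <= 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision with the maximum precision at any higher rank,
    # which makes the curve non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangle areas over the steps where recall increases.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap
```

As a toy usage example, `average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)` evaluates to about 0.83: the single false positive ranked between the two correct detections lowers the precision at full recall to 2/3. Applying such a function separately to the detections collected in each distance interval would yield per-class, per-interval values comparable in form to those reported above.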

Keywords:
computer vision, deep learning, human detection, mobile robot, search and rescue.

Citation:
Farkhetdinov R, Abbyasov B, Eryomin A, Dobrokvashina A, Svinin M, Magid E. YOLOv10s based human victim detection in cluttered urban environments with a crawler type rescue robot. Computer Optics 2026; 50(2): 1710. DOI: 10.18287/COJ1710.
