The Design and Implementation of an Intelligent Guide Dog Robot Based on Multimodal Perception

Yanxuan Zhu, Nanjing Jinling Middle School, Nanjing 210005, Jiangsu, China
Keywords: Quadruped robot, Guide system, Multimodal perception, Object detection, Human-robot interaction, Path planning

Abstract

Traditional guide devices suffer from single-modality environmental perception and poor terrain adaptability. To address these problems, this paper proposes an intelligent guide system built on a quadruped robot platform. Multi-sensor spatiotemporal registration is used to fuse data from a millimeter-wave radar (angular accuracy of ±0.1°) and an RGB-D camera, and a dataset tailored to guide dog robots is constructed. For edge deployment on the guide dog robot, a lightweight CA-YOLOv11 object detection model integrating an attention mechanism is adopted; it achieves an overall recognition accuracy of 95.8% in complex scenarios, 2.2% higher than the baseline YOLOv11 network. The system supports navigation over complex terrain such as stairs (25 cm steps) and slopes (35° gradient), and its response time to sudden disturbances is reduced to 100 ms. Field tests show a navigation success rate of 95% across eight types of scenarios, a user satisfaction score of 4.8/5.0, and a cost 50% lower than that of a traditional guide dog.
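
To make the abstract's pipeline concrete, the following is a minimal sketch of the spatiotemporal registration step: the radar and camera streams are first aligned in time by nearest-timestamp matching, and radar points are then mapped into the camera frame with a calibrated extrinsic transform. The function names, the 20 ms skew tolerance, and the transform matrix are illustrative assumptions; the paper does not publish its implementation.

    import numpy as np

    def nearest_timestamp_pairs(radar_ts, cam_ts, max_skew=0.02):
        # Temporal registration: for each radar scan, find the camera frame
        # closest in time (both streams assumed sorted by timestamp) and
        # accept the pair only if the clocks agree to within max_skew seconds.
        pairs = []
        j = 0
        for i, t in enumerate(radar_ts):
            while j + 1 < len(cam_ts) and abs(cam_ts[j + 1] - t) < abs(cam_ts[j] - t):
                j += 1
            if abs(cam_ts[j] - t) <= max_skew:
                pairs.append((i, j))
        return pairs

    def radar_points_to_camera(points_radar, T_radar_to_cam):
        # Spatial registration: lift N x 3 radar points to homogeneous
        # coordinates and apply the 4 x 4 radar-to-camera extrinsic
        # transform obtained from offline calibration.
        homog = np.hstack([points_radar, np.ones((len(points_radar), 1))])
        return (T_radar_to_cam @ homog.T).T[:, :3]

The "CA" in CA-YOLOv11 is read here as coordinate attention (Hou et al., CVPR 2021), a common lightweight attention block for edge-deployed YOLO variants; the abstract only states that an attention mechanism is integrated, so the PyTorch module below is a plausible sketch of such a block rather than the authors' exact design.

    import torch
    import torch.nn as nn

    class CoordAttention(nn.Module):
        # Coordinate attention: factorizes global pooling into two 1-D pools
        # along height and width so the attention map retains positional
        # information, at a cost suited to edge devices.
        def __init__(self, channels, reduction=32):
            super().__init__()
            mid = max(8, channels // reduction)
            self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool over W -> (B, C, H, 1)
            self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool over H -> (B, C, 1, W)
            self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
            self.bn = nn.BatchNorm2d(mid)
            self.act = nn.Hardswish()
            self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
            self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

        def forward(self, x):
            b, c, h, w = x.shape
            xh = self.pool_h(x)                      # (B, C, H, 1)
            xw = self.pool_w(x).permute(0, 1, 3, 2)  # (B, C, W, 1)
            # Shared 1x1 conv over the concatenated direction-wise descriptors.
            y = self.act(self.bn(self.conv1(torch.cat([xh, xw], dim=2))))
            yh, yw = torch.split(y, [h, w], dim=2)
            ah = torch.sigmoid(self.conv_h(yh))                      # height attention
            aw = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # width attention
            return x * ah * aw

In a YOLO-style detector such a block is typically inserted after selected backbone or neck stages, reweighting each feature map before it reaches the detection head; a 64-channel feature map would be wrapped as CoordAttention(64)(features).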

