The National Natural Science Foundation of China (General Program, Key Program, Major Research Plan)
In the early training phase of robot path planning, rewards are difficult to obtain with deep reinforcement learning. To reduce training time, an intrinsic curiosity deep deterministic policy gradient (ICDDPG) algorithm is proposed for end-to-end robot path planning with continuous action output. The algorithm takes perceived environment information as input and outputs continuous control of the robot's motion (linear velocity and angular velocity). Training and validation are carried out on the Gazebo simulation platform. The simulation results show that ICDDPG helps to alleviate the problem of hard-to-obtain rewards, and that the proposed algorithm learns a better control policy than deep Q-learning networks. The algorithm is also verified in a real environment, where the robot successfully reaches the target points in the presence of static and dynamic obstacles.
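The abstract does not give the form of the curiosity term, so the following is only a minimal sketch of how an intrinsic curiosity bonus is commonly combined with the environment reward, assuming the widely used forward-model formulation in which the bonus is the scaled squared error between predicted and observed next-state features. The linear forward model, the scale `eta`, and all function names are hypothetical and are not taken from the paper.

```python
def forward_model(weights, features, action):
    """Hypothetical linear forward model (an assumption, not the paper's network).

    Predicts the next feature vector from the current features and the action.
    """
    x = list(features) + list(action)
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def intrinsic_reward(weights, features, action, next_features, eta=0.5):
    """Curiosity bonus: (eta / 2) times the squared prediction error."""
    predicted = forward_model(weights, features, action)
    error = sum((p - t) ** 2 for p, t in zip(predicted, next_features))
    return 0.5 * eta * error


def total_reward(extrinsic, intrinsic):
    """Reward used for the actor-critic update: task reward plus curiosity bonus."""
    return extrinsic + intrinsic


# Illustrative values only: encoded sensor features and a continuous action
# (linear velocity, angular velocity), as in the abstract.
features = [0.2, -0.1]
action = [0.3, 0.05]
next_features = [0.25, -0.05]

# An untrained model that predicts zeros yields a large bonus for new states.
untrained = [[0.0] * 4 for _ in range(2)]
r_int = intrinsic_reward(untrained, features, action, next_features)
r_total = total_reward(0.0, r_int)
```

Under this assumption, transitions the forward model predicts poorly earn a larger bonus, so the agent still receives a learning signal before it has reached any target point; as the model improves, the bonus shrinks and the task reward dominates.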
ZHANG Yongmei, ZHAO Jiarui, WU Aiyan. A Robot Path Planning Algorithm Based on Curiosity-driven Deep Reinforcement Learning[J]. Science Technology and Engineering, 2022, 22(25): 11075-11083. (in Chinese)