Deep Reinforcement Learning-Based Compliance Control Method for Peg-in-Hole Assembly
DOI:
Author:
Affiliation:

Tianjin University of Technology

Author biography:

Corresponding author:

CLC number:

TP181

Fund project:

National Natural Science Foundation of China (U1813208); National Natural Science Foundation of China (61873188)


Deep Reinforcement Learning-Based Compliance Control Method for Peg-in-Hole Assembly
Author:
Affiliation:

Tianjin University of Technology

Fund Project:

Abstract:

In industrial robot assembly, acquiring the contact state of the mating parts in the environment and deriving the state-space parameters of the assembly, so that workpieces can be joined with high precision, is an important route to precision assembly in collaborative industrial robot systems. When facing complex environments, new tasks, and high-precision requirements, traditional part-mating methods require a large number of experimental parameters to be prepared and tuned for the current physical environment and task conditions before deployment. This paper trains the robot with a model-free reinforcement learning compliant control method based on a recurrent neural network. The model adopts the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which extends the Deep Deterministic Policy Gradient (DDPG) algorithm with twin networks and delayed updates; a comparison of training results shows that TD3 performs better on continuous action-space problems. By analysing the contact state, the method learns a mapping to the action space: the trajectory decision policy provided by the DRL algorithm is combined with a compliant controller to control the position of the part held by the end effector. This allows the robot to complete assembly autonomously while also assisting in adjusting the spatial pose of the workpiece, and effectively reduces the precision loss caused by excessive contact forces. Experiments were conducted on a robot model in a physics-based simulation environment. With different control groups, the data show a success rate above 95% and measured contact forces that do not exceed 20 N, indicating that the proposed control method has good robustness and efficiency.
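As a concrete illustration of the algorithm described above, the following is a minimal sketch of one TD3 update step in PyTorch, showing the twin critics, target policy smoothing, and delayed actor/target updates. The network sizes, the 6-dimensional action, and the observation layout are assumptions made for the example, not details taken from the paper.

import copy
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 12, 6  # assumed: pose + contact-force observation, 6-DoF pose-increment action

def mlp(in_dim, out_dim):
    # Small two-hidden-layer network used for both actor and critics.
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

actor = nn.Sequential(mlp(OBS_DIM, ACT_DIM), nn.Tanh())  # actions scaled to [-1, 1]
critic1, critic2 = mlp(OBS_DIM + ACT_DIM, 1), mlp(OBS_DIM + ACT_DIM, 1)
actor_t, critic1_t, critic2_t = map(copy.deepcopy, (actor, critic1, critic2))
opt_actor = torch.optim.Adam(actor.parameters(), lr=3e-4)
opt_critic = torch.optim.Adam(list(critic1.parameters()) + list(critic2.parameters()), lr=3e-4)

GAMMA, TAU, POLICY_DELAY, NOISE, NOISE_CLIP = 0.99, 0.005, 2, 0.2, 0.5

def td3_update(batch, step):
    # batch: obs/act/next_obs of shape (N, dim), rew/done of shape (N, 1), from a replay buffer.
    obs, act, rew, next_obs, done = batch
    with torch.no_grad():
        # Target policy smoothing: clipped noise on the target action.
        noise = (torch.randn_like(act) * NOISE).clamp(-NOISE_CLIP, NOISE_CLIP)
        next_act = (actor_t(next_obs) + noise).clamp(-1.0, 1.0)
        # Clipped double-Q: take the minimum of the two target critics.
        q_next = torch.min(critic1_t(torch.cat([next_obs, next_act], 1)),
                           critic2_t(torch.cat([next_obs, next_act], 1)))
        target = rew + GAMMA * (1.0 - done) * q_next
    q1 = critic1(torch.cat([obs, act], 1))
    q2 = critic2(torch.cat([obs, act], 1))
    critic_loss = nn.functional.mse_loss(q1, target) + nn.functional.mse_loss(q2, target)
    opt_critic.zero_grad(); critic_loss.backward(); opt_critic.step()
    if step % POLICY_DELAY == 0:
        # Delayed actor update and soft (Polyak) update of the target networks.
        actor_loss = -critic1(torch.cat([obs, actor(obs)], 1)).mean()
        opt_actor.zero_grad(); actor_loss.backward(); opt_actor.step()
        for net, tgt in ((actor, actor_t), (critic1, critic1_t), (critic2, critic2_t)):
            for p, p_t in zip(net.parameters(), tgt.parameters()):
                p_t.data.mul_(1.0 - TAU).add_(TAU * p.data)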

Abstract:

Within the realm of industrial robot assembly, acquiring the contact status of assembly components in the environment and deriving the parameters of the assembly-component state space are pivotal for achieving precise workpiece assembly in collaborative industrial robot systems. Confronted with intricate environments, novel tasks, and heightened precision requirements, traditional methods of component fitting necessitate substantial preparation and tuning of experimental parameters tailored to the prevailing physical environment and task conditions prior to deployment. This paper proposes a model-free, recurrent-neural-network-based compliant control approach to train robots. The model adopts the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which extends the Deep Deterministic Policy Gradient (DDPG) algorithm with twin networks and delayed updates. A comparison of training outcomes shows that TD3 performs better on problems with continuous action spaces. Through analysis of the contact status, the method learns a mapping to the action space; the trajectory decision-making strategy provided by the DRL algorithm is combined with a compliant controller to control the position of the component on the end effector. This enables the robot to complete assembly tasks autonomously while assisting in adjusting the spatial pose of the workpiece, effectively reducing the precision loss caused by excessive contact forces. Experimental verification on a simulated robot module with realistic physical effects demonstrates a success rate of over 95% across different control groups, with measured contact forces not exceeding 20 N, indicating that the control method exhibits good robustness and efficiency.
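To make the combination of the learned policy and the compliant position control more concrete, the sketch below shows one way a policy's pose increment could be blended with a simple admittance-style correction so that the measured contact force stays below the 20 N bound reported above. The gains and the policy_step, read_wrench, and move_linear_relative callables are hypothetical placeholders for this example, not interfaces described in the paper.

import numpy as np

K_COMPLIANCE = 0.0005   # m per N, assumed translational compliance gain
FORCE_LIMIT = 20.0      # N, contact-force bound reported in the abstract

def compliant_step(policy_step, read_wrench, move_linear_relative, obs):
    # One assembly step: DRL pose increment plus a force-based positional correction.
    d_pose = np.asarray(policy_step(obs))   # 6-D pose increment [dx, dy, dz, drx, dry, drz]
    force = np.asarray(read_wrench())[:3]   # measured Cartesian contact force in N
    # Back off along the measured force direction and shrink the commanded
    # translation as the force approaches the limit.
    correction = -K_COMPLIANCE * force
    scale = max(0.0, 1.0 - np.linalg.norm(force) / FORCE_LIMIT)
    cmd = np.concatenate([scale * d_pose[:3] + correction, d_pose[3:]])
    move_linear_relative(cmd)               # send the corrected increment to the robot
    return cmd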

Cite this article

赵桐, 孙启湲, 刘振忠. Deep Reinforcement Learning-Based Compliance Control Method for Peg-in-Hole Assembly [J]. 科学技术与工程, , ():

History
  • Received: 2023-06-09
  • Revised: 2023-06-15
  • Accepted: 2023-06-15
  • Published online:
  • Publication date: