Abstract: In industrial robot assembly, acquiring the contact state of assembly components in the environment and deriving the parameters of the component state space are pivotal for achieving precise workpiece assembly in collaborative industrial robot systems. Faced with intricate environments, novel tasks, and heightened precision requirements, traditional component-fitting methods require substantial preparation and tuning of experimental parameters for the prevailing physical environment and task conditions prior to deployment. This paper proposes a model-free, recurrent-neural-network-based compliant control approach for training robots. The model incorporates the Twin Delayed Deep Deterministic policy gradient algorithm (TD3), which extends the Deep Deterministic Policy Gradient (DDPG) algorithm with twin critic networks and delayed policy updates. Comparative analysis of training outcomes shows that TD3 performs better on the challenges posed by continuous action spaces. Through analysis of the contact state, the method learns a mapping over the action space. Leveraging deep reinforcement learning (DRL), it combines trajectory decision-making strategies with a compliant controller to control the positioning of components on the end effector. This enables the robot to autonomously complete assembly tasks while adjusting the spatial pose of the workpiece, effectively reducing the precision loss caused by excessive contact forces. Experiments on a simulated robot module with realistic physical effects, conducted across several control groups, demonstrate a success rate of over 95%, with contact forces remaining below 20 N, indicating that the proposed control method is robust and efficient.
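The two TD3 extensions named above, twin critics with a clipped double-Q target and delayed policy updates, can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the linear `Critic`, the fixed transition, and the `td3_target` helper are all assumptions for demonstration, and target networks are folded into the critics for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

class Critic:
    """Toy linear Q(s, a) = w0*s + w1*a with a simple SGD step (illustrative only)."""
    def __init__(self):
        self.w = rng.normal(size=2)

    def q(self, s, a):
        return self.w[0] * s + self.w[1] * a

    def update(self, s, a, target, lr=0.01):
        err = self.q(s, a) - target
        self.w -= lr * err * np.array([s, a])

def td3_target(r, s_next, a_next, c1, c2, gamma=0.99):
    # Clipped double-Q: take the minimum of the two critics' estimates,
    # which curbs the value overestimation DDPG is prone to.
    # (Separate target networks are omitted in this sketch.)
    return r + gamma * min(c1.q(s_next, a_next), c2.q(s_next, a_next))

c1, c2 = Critic(), Critic()
policy_updates = 0
POLICY_DELAY = 2  # actor and target networks updated every 2 critic updates

for step in range(10):
    # A fixed dummy transition (s, a, r, s', a') standing in for a replay batch.
    s, a, r, s_next, a_next = 0.5, 0.1, 1.0, 0.4, 0.05
    y = td3_target(r, s_next, a_next, c1, c2)
    c1.update(s, a, y)  # both critics regress toward the same clipped target
    c2.update(s, a, y)
    if step % POLICY_DELAY == 0:
        policy_updates += 1  # delayed actor + target update would happen here

print(policy_updates)  # actor updated half as often as the critics
```

The key design point shown is that the actor is deliberately updated less frequently than the critics, so the policy is improved against value estimates that have had time to stabilize.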