Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (16): 143-149. DOI: 10.3778/j.issn.1002-8331.2302-0250

• Pattern Recognition and Artificial Intelligence •

Research on Design and Control Strategy of Universal Joint Snake-Like Robot

LI Yaxin, LU Yunfei, HE Ziwei, ZHOU Zhenghui

  1. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu 610500, China
  • Online: 2023-08-15 Published: 2023-08-15

Abstract: To address the complex structure and limited flexibility of snake-like robots, a snake-like robot with cross-shaft universal joints is designed. The robot consists of 6 modular units, and each module is equipped with a passive wheel. In each module, a motor drives the slider on a ball screw, and the connecting rod then deflects the universal joint, producing the meandering motion. In addition, the multiple degrees of freedom of the universal joint limit mechanism ensure the flexibility of the robot's movement. Because snake-like robots are difficult to model, a control strategy based on deep reinforcement learning is also proposed. An interactive learning environment is built with the MuJoCo physics engine, and the proximal policy optimization (PPO) algorithm is used to train an optimal motion policy that guides the robot's actions. Simulation results obtained with the designed robot model show that the policy trained by PPO completes the straight-line forward task in environments with different friction coefficients, indicating that the robot has a degree of adaptability to different terrain. Finally, experiments on a physical prototype verify the feasibility and stability of the scheme.

Key words: snake-like robot, universal structure, reinforcement learning, proximal policy optimization (PPO)
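
For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective that the algorithm maximizes during policy updates. It is not the authors' implementation: the abstract does not give the reward design, network architecture, or hyperparameters, so the function name `ppo_clip_objective` is hypothetical and the clipping range of 0.2 is only a commonly used default, assumed here for illustration.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps: clipping range (0.2 is an assumed, commonly used default).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the current and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. When the ratio leaves the clipping range in the direction that would increase the objective, the clipped term takes over, which removes the incentive for overly large policy updates.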