Iranian Journal of Mechanical Engineering Transactions of ISME

Gait Control of Humanoid Robot with Toe Joints Based on Reinforcement Learning

Authors
1 M.Sc., Department of Mechanical Engineering, Tarbiat Modares University, Tehran, Iran
2 Assistant Professor, Department of Mechanical Engineering, Tarbiat Modares University, Tehran, Iran
Abstract
Controlling a humanoid robot is a complicated task because the system has many degrees of freedom and is non-holonomic and underactuated. Many model-based control strategies have been applied to humanoid robots; over time, model-free and AI-based strategies have gained ground. Among AI strategies, Reinforcement Learning has the largest share, and many complex systems have been successfully controlled to perform complicated tasks such as jumping and running. However, toe joints are absent from almost all of these systems, so they cannot serve the role toes play in human locomotion. Since toed robots can outperform their toeless counterparts, this work studies the implementation of Reinforcement Learning algorithms on a humanoid with an active toe joint. Two algorithms, DDPG and TD3, were applied and compared, and a customized RL framework was designed to teach the humanoid to walk. Simulations showed that the walking task was accomplished: the trained robot walked on a flat surface at an average speed of 0.9 m/s.
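The abstract names TD3 as one of the two algorithms compared. The core difference between TD3 and DDPG is TD3's target computation: a clipped double-Q estimate with target-policy smoothing, plus soft target-network updates. A minimal NumPy sketch of that target rule is shown below; the linear actor/critics, dimensions, and hyperparameters here are illustrative defaults only and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny linear actor/critics to illustrate the TD3 target rule;
# the paper's actual network sizes and hyperparameters are not given here.
obs_dim, act_dim = 4, 2
gamma, tau = 0.99, 0.005          # discount factor and soft-update rate
noise_std, noise_clip = 0.2, 0.5  # target-policy smoothing (common TD3 defaults)

W_actor = rng.normal(size=(act_dim, obs_dim))
W_q1 = rng.normal(size=(obs_dim + act_dim,))
W_q2 = rng.normal(size=(obs_dim + act_dim,))
# Target networks start as exact copies of the online networks.
W_actor_t, W_q1_t, W_q2_t = W_actor.copy(), W_q1.copy(), W_q2.copy()

def q(w, s, a):
    """Linear critic: Q(s, a) = w . [s; a]."""
    return w @ np.concatenate([s, a])

def td3_target(s_next, reward, done):
    """Clipped double-Q target with a smoothed (noisy, clipped) target action."""
    a_next = np.tanh(W_actor_t @ s_next)
    eps = np.clip(rng.normal(0.0, noise_std, act_dim), -noise_clip, noise_clip)
    a_next = np.clip(a_next + eps, -1.0, 1.0)
    # Taking the minimum of the two target critics combats overestimation bias.
    q_min = min(q(W_q1_t, s_next, a_next), q(W_q2_t, s_next, a_next))
    return reward + gamma * (1.0 - done) * q_min

def soft_update(target, online):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return (1.0 - tau) * target + tau * online
```

DDPG uses the same actor-critic structure but a single critic and no smoothing noise on the target action, which is why it tends to overestimate Q-values relative to TD3.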

  • Receive Date 10 October 2022
  • Revise Date 15 January 2024
  • Accept Date 03 March 2024