Walking Control of a Humanoid Robot with Active Toe Joints Using Reinforcement Learning

Tavangar, Aref; Sadedel, Majid

doi:10.30506/ijmep.2024.562717.1900

Article type: Research article

Authors

Aref Tavangar ¹

Majid Sadedel ²

¹ M.Sc., Faculty of Mechanical Engineering, Tarbiat Modares University, Tehran, Islamic Republic of Iran

² Assistant Professor, Faculty of Mechanical Engineering, Tarbiat Modares University, Tehran, Islamic Republic of Iran

Abstract

Applying controllers based on artificial intelligence in robotics has produced outstanding results. Among AI-based methods, reinforcement learning methods account for the largest share of use. Despite the advantages of an active toe joint, the most advanced humanoid robots controlled with reinforcement learning have lacked active toes. In this study, two algorithms, DDPG (Deep Deterministic Policy Gradient) and TD3 (Twin Delayed Deep Deterministic Policy Gradient), are applied to a humanoid robot with toes and compared with each other. The effectiveness of the designed framework in controlling the toed humanoid robot is evaluated in simulation, where the robot was able to walk along a flat path at a speed of 0.9 m/s.
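To make the comparison between the two algorithms concrete, the following is a minimal sketch of how the critic's learning target differs between DDPG and TD3 in their standard published forms. It is not the authors' implementation: the function names, the scalar interface, and the values `sigma=0.2` and `noise_clip=0.5` are illustrative assumptions and do not come from this article.

```python
import random


def ddpg_target(reward, done, gamma, q_next):
    """DDPG bootstrap target: r + gamma * Q'(s', mu'(s')).

    q_next is the single target critic's value at the next state
    (hypothetical scalar interface, for illustration only).
    """
    return reward + gamma * (1.0 - done) * q_next


def td3_target(reward, done, gamma, q1_next, q2_next):
    """TD3 target: take the smaller of two target critics.

    Using min(Q1', Q2') is the "clipped double Q" idea that counters
    the value overestimation a single critic tends to accumulate.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def smoothed_action(action, sigma=0.2, noise_clip=0.5,
                    low=-1.0, high=1.0, rng=random):
    """TD3 target policy smoothing for one scalar action.

    Adds Gaussian noise clipped to [-noise_clip, noise_clip], then
    clamps the result to the assumed actuator range [low, high].
    """
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, sigma)))
    return max(low, min(high, action + noise))
```

With the same reward and discount, the TD3 target is never larger than the DDPG target computed from either of its two critics. TD3 additionally updates the actor less often than the critics, which this sketch does not show.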

Keywords

Humanoid robot

Active toe joint

Learning to walk

Deep reinforcement learning

Topics

Design and control of robots and mechanisms

Volume 26, Issue 3 - Serial Number 76
Autumn 1403 (Solar Hijri)
Pages 29-44

Received: 18 Mehr 1401
Revised: 25 Dey 1402
Accepted: 13 Esfand 1402

Iranian Journal of Mechanical Engineering
