Applications of Reinforcement Learning on Humanoid Robot Controlling
DOI: https://doi.org/10.54097/qm1n5316
Keywords: Reinforcement learning; humanoid robot controlling; applications
Abstract
Amidst the swift advancement of artificial intelligence and machine learning technologies, humanoid robots are increasingly demonstrating their potential in mimicking human motions, adapting to intricate environments, and executing a wide array of tasks. Reinforcement Learning (RL), an advanced learning paradigm that optimizes decision-making through interaction with the environment, has emerged as a pivotal tool for enhancing the capabilities of humanoid robots. In this paper, we examine a variety of RL techniques and their implementations, spotlighting key accomplishments, addressing prevailing challenges, and envisaging future trajectories. Through an in-depth examination of these applications, we aim to elucidate the transformative potential of RL in the domain of humanoid robotics, paving the way for more adaptive, intelligent, and autonomous systems. This paper posits that, as research deepens and technology progresses, RL will catalyze breakthroughs in the humanoid robot sector, propelling smart robots toward greater integration within human society.
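To make the abstract's core mechanism concrete, the loop of "optimizing decision-making through interaction with the environment" can be sketched with tabular Q-learning on a toy problem. The chain environment, hyperparameters, and episode count below are illustrative assumptions for this sketch only; they are not drawn from the paper or from any humanoid platform.

```python
import random

# Illustrative toy environment (an assumption, not from the paper):
# a deterministic 5-state chain; action 0 moves left, action 1 moves right.
# Reaching the rightmost state yields reward 1 and ends the episode.
N_STATES, ACTIONS = 5, (0, 1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def step(state, action):
    """Environment transition: move one step along the chain."""
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

# Q-table initialised to zero: Q[state][action]
Q = [[0.0, 0.0] for _ in range(N_STATES)]

random.seed(0)
for _ in range(500):                       # episodes of environmental interaction
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection balances exploration and exploitation
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2, r, done = step(s, a)
        # Q-learning update: move Q[s][a] toward the bootstrapped target
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

greedy_policy = [0 if q[0] > q[1] else 1 for q in Q[:-1]]
print(greedy_policy)  # the learned greedy policy moves right toward the goal
```

The same interaction-and-update loop underlies the deep RL methods surveyed in the paper, with the table replaced by a neural network and the toy chain by a simulated or physical robot.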
License
Copyright (c) 2024 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.