Research on Path Planning of Warehouse Robots Based on Q-learning
DOI:
https://doi.org/10.54097/xn163725
Keywords:
Q-learning; Path planning; Full-factor experiment.
Abstract
The rapid growth of e-commerce has intensified the demand for efficient warehouse management and automation. This research applies Q-learning, a reinforcement learning algorithm, to the path planning of warehouse robots. Path planning is critical in warehouse robotics, particularly in dynamic environments, where efficient navigation can significantly improve operational productivity. The study aims to develop a Q-learning-based algorithm that enables warehouse robots to navigate their workspace effectively, minimizing the time taken to reach designated goals while avoiding obstacles. The approach constructs a grid world model of the warehouse environment in which the robot learns to make optimal decisions from rewards for reaching goals and penalties for colliding with obstacles. By modeling the warehouse as a discrete grid with obstacles and defining the robot's task cycle (from the starting point to cargo pickup, then to delivery, and finally the return trip), the study investigates how varying the learning rate (α) and discount factor (γ) affects the robot's learning efficiency and path optimization. A full-factor experiment testing 81 combinations of α and γ revealed that higher values of γ, especially when paired with appropriate α values, significantly reduce the number of steps required for task completion and decrease path-planning failures. The findings underscore the importance of parameter tuning in Q-learning algorithms for the efficiency and reliability of autonomous warehouse robots.
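To make the setup described in the abstract concrete, the following is a minimal sketch of tabular Q-learning on a small grid world. It is not the authors' implementation. The 5x5 layout, the reward values (+100 for the goal, -10 for a collision, -1 per step), the epsilon-greedy exploration rate, the episode counts, and all function names are assumptions chosen for illustration. The sketch covers a single leg of the task cycle (start to one goal) rather than the full pickup, delivery, and return sequence. The update applied at each step is the standard Q-learning rule, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)].

```python
import random

# Hypothetical 5x5 warehouse grid: 0 = free cell, 1 = obstacle (layout is assumed).
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
SIZE = 5
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move and return (next_state, reward, done). Reward values are assumed."""
    r, c = state[0] + action[0], state[1] + action[1]
    blocked = not (0 <= r < SIZE and 0 <= c < SIZE) or GRID[r][c] == 1
    if blocked:
        return state, -10.0, False  # collision penalty; the robot stays in place
    if (r, c) == GOAL:
        return (r, c), 100.0, True  # goal reward ends the episode
    return (r, c), -1.0, False  # small step cost to favour short paths


def train(alpha, gamma, episodes=500, epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; returns the Q-table."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Terminal transitions have no bootstrapped future value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path_length(q, max_steps=50):
    """Follow the greedy policy from START; return the step count, or None on failure."""
    state = START
    for n in range(1, max_steps + 1):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        if done:
            return n
    return None


def sweep(levels=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    """Full-factor sweep over alpha and gamma; nine assumed levels each give 81 runs."""
    return {(a, g): greedy_path_length(train(a, g)) for a in levels for g in levels}
```

Calling sweep() returns a dictionary keyed by (α, γ), whose values are the greedy path length or None when the learned policy fails to reach the goal. The nine levels per parameter are only one assumed way to obtain 81 combinations; the abstract does not state the levels that were actually tested.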
License
Copyright (c) 2025 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







