Review of Application of Model-free Reinforcement Learning in Intelligent Decision
DOI:
https://doi.org/10.54097/hset.v56i.10596
Keywords:
artificial intelligence; model-free reinforcement learning; intelligent decision.
Abstract
With the continuous development of intelligent technologies, the traditional field of decision making is gradually evolving towards Intelligent Decision (ID). At the same time, the increasing complexity of the environment, the explosive growth of data volume, and the uncertainty of the decision-making process are making decision analysis more difficult. As a branch of Machine Learning, Reinforcement Learning (RL) trains an Agent through interaction with an environment that returns rewards, ultimately producing an intelligent model. Model-free Reinforcement Learning (MFRL) is a type of reinforcement learning in which the Agent does not need a predefined model of the environment; instead, it interacts directly with the environment and learns autonomously to generate optimal strategies in complex environments. Applied to Intelligent Decision making, Model-free Reinforcement Learning techniques can improve the efficiency and accuracy of decisions in complex environments. In this paper, we provide an overview of Model-free Reinforcement Learning in intelligent decision making and introduce the basic principles of reinforcement learning and its two branches (Model-based Reinforcement Learning and Model-free Reinforcement Learning). The algorithms of Model-free Reinforcement Learning are analyzed in two groups (value-based and policy-based), and the characteristics, range of applicability, and research results of each algorithm are summarized. Typical applications of Model-free Reinforcement Learning in the field of intelligent decision making are classified and analyzed. Finally, a summary and outlook on the application of Model-free Reinforcement Learning in Intelligent Decision making are presented.
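The model-free idea described in the abstract (an agent that learns from direct interaction, with no predefined environment model) can be illustrated with a minimal sketch. The abstract does not name a specific algorithm, so tabular Q-learning is used here as one well-known value-based method. The five-state chain environment, the `step` and `q_learning` functions, and all hyperparameters are hypothetical choices made for illustration only.

```python
import random

def step(state, action, n_states=5):
    """Hypothetical chain environment (an assumption, not from the paper).
    Action 1 moves right, action 0 moves left; reaching the last state
    yields reward 1 and ends the episode."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: the agent only observes (state, reward) samples
    returned by step() and never uses a transition model for planning."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice([a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action, n_states)
            # Bootstrapped target from the sampled transition.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
```

In this toy setting the greedy policy learned from the table moves right in every non-terminal state. A policy-based method would instead adjust the parameters of the policy directly rather than deriving it from learned action values.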
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







