Co-training by Experience Replay for Reinforcement Learning

Authors

  • Yuyang Huang

DOI:

https://doi.org/10.54097/hset.v39i.6585

Keywords:

Reinforcement Learning; Co-training; Experience Replay; ViZDoom; Duel DQN.

Abstract

In this paper, to improve the efficiency with which a reinforcement learning model explores its environment and to obtain better results, a new method is developed that introduces co-training into reinforcement learning by sharing each agent's experience pool during training. Because the agents use different policies to take actions and explore the environment, they gain a broader understanding of it. In addition, this paper designs an agent called the Hard Memory Collector by modifying the value function, and combines this agent with a normal agent for co-training. In experiments on the ViZDoom platform, the model achieved better results than the original Duel DQN network in terms of score, steps used per game, and loss value.
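To illustrate the core idea of sharing one experience pool across co-trained agents, here is a minimal sketch. It is not the paper's ViZDoom/Duel DQN implementation: the buffer class, the toy transition generator, and the agent labels ("explorer", "exploiter") are all illustrative assumptions, standing in for two agents with different exploration policies that both write to, and train from, the same replay memory.

```python
import random
from collections import deque

class SharedReplayBuffer:
    """A single experience pool shared by all co-trained agents."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        # transition = (state, action, reward, next_state, done)
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform minibatch over transitions from *all* agents.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buffer = SharedReplayBuffer(capacity=10_000)

def collect(epsilon, n_steps=50):
    """Toy rollout: an epsilon-greedy agent pushes its transitions
    into the shared pool instead of a private one."""
    state = 0
    for _ in range(n_steps):
        action = random.randrange(3) if random.random() < epsilon else 0
        next_state, reward, done = state + 1, float(action == 0), False
        buffer.push((state, action, reward, next_state, done))
        state = next_state

collect(epsilon=0.9)  # "explorer": high-exploration policy
collect(epsilon=0.1)  # "exploiter": mostly greedy policy
batch = buffer.sample(32)  # either agent can now train on this mixed batch
```

Because both policies deposit into one pool, each agent's value-function updates are computed over trajectories it would not have generated itself, which is the mechanism by which the co-trained agents see more of the environment.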


References

M. Kempka, M. Wydmuch, G. Runc, J. Toczek, and W. Jaskowski, “ViZDoom: A Doom-based AI research platform for visual reinforcement learning,” in 2016 IEEE Conference on Computational Intelligence and Games (CIG), Santorini, Greece, Sep. 2016, pp. 1–8. doi: 10.1109/CIG.2016.7860433.

M. Wydmuch, M. Kempka, and W. Jaśkowski, “ViZDoom Competitions: Playing Doom from Pixels,” IEEE Trans. Games, vol. 11, no. 3, pp. 248–259, Sep. 2019, doi: 10.1109/TG.2018.2877047.

H. Guan, “Analysis on Deep Reinforcement Learning in Industrial Robotic Arm,” in 2020 International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI), Sanya, China, Dec. 2020, pp. 426–430. doi: 10.1109/ICHCI51889.2020.00094.

G. Lample and D. S. Chaplot, “Playing FPS Games with Deep Reinforcement Learning,” AAAI, vol. 31, no. 1, Feb. 2017, doi: 10.1609/aaai.v31i1.10827.

A. Blum and T. Mitchell, “Combining Labeled and Unlabeled Data with Co-Training,” in Proceedings of the Eleventh Annual Conference on Computational Learning Theory (COLT), 1998, pp. 92–100.

J. Song, R. Lanka, Y. Yue, and M. Ono, “Co-training for Policy Learning,” in Proceedings of the 35th Conference on Uncertainty in Artificial Intelligence (UAI), 2019.

M. Andrychowicz et al., “Hindsight Experience Replay.” arXiv, Feb. 23, 2018. Accessed: Aug. 07, 2022. [Online]. Available: http://arxiv.org/abs/1707.01495.

J. Foerster et al., “Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning,” in Proceedings of the 34th International Conference on Machine Learning (ICML), PMLR 70, 2017.

R. Liu and J. Zou, “The Effects of Memory Replay in Reinforcement Learning.” arXiv, Oct. 17, 2017. Accessed: Aug. 28, 2022. [Online]. Available: http://arxiv.org/abs/1710.06574.

Z. Wang, T. Schaul, M. Hessel, H. van Hasselt, M. Lanctot, and N. de Freitas, “Dueling Network Architectures for Deep Reinforcement Learning,” in Proceedings of the 33rd International Conference on Machine Learning (ICML), PMLR 48, 2016.

V. Mnih et al., “Playing Atari with Deep Reinforcement Learning.” arXiv, Dec. 19, 2013. Accessed: Aug. 28, 2022. [Online]. Available: http://arxiv.org/abs/1312.5602.


Published

01-04-2023

How to Cite

Huang, Y. (2023). Co-training by Experience Replay for Reinforcement Learning. Highlights in Science, Engineering and Technology, 39, 545-549. https://doi.org/10.54097/hset.v39i.6585