Exploration of Teaching Reform in Reinforcement Learning Courses Based on Model Framework Programming

Authors

  • Weifeng Xu
  • Mingquan Zhang
  • Hongtao Wang

DOI:

https://doi.org/10.54097/q1h8v934

Keywords:

Reinforcement Learning, Model Framework Programming, Practical Teaching

Abstract

In this paper, we summarize and analyze the problems in teaching reinforcement learning and explore a new method for teaching its core algorithms, based on model framework programming and tied together by a single running case. First, we quickly construct a simulation environment for the "drone data collection" case through model framework programming. Second, we model the case as a Markov decision process and elaborate on the similarities and differences between the deep Q-learning algorithm and the actor-critic algorithm. Finally, by combining this with case programming, we explain the core ideas and characteristics of the two algorithms visually, in order to improve students' hands-on ability and enhance the teaching quality of reinforcement learning courses.
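
The abstract does not give the authors' environment or code, so the following is only a minimal illustrative sketch of the kind of case it describes. It assumes a hypothetical 4x4 grid in which a drone collects data from three sensor cells; the names `reset`, `step`, `SENSORS`, the reward values, and the learning rates are all assumptions made for illustration. Tabular Q-learning stands in for deep Q-learning here (same bootstrapped target, no neural network), so that the value-based update can be set side by side with a one-step actor-critic update on the same Markov decision process.

```python
import math
from collections import defaultdict

# Hypothetical toy MDP (not the paper's environment).
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left
SIZE = 4                                    # grid is SIZE x SIZE
SENSORS = [(0, 3), (3, 0), (3, 3)]          # assumed sensor locations

def reset():
    """Initial state: drone at the origin, no sensor data collected."""
    return (0, 0, 0)  # (x, y, bitmask of sensors already collected)

def step(state, action):
    """One MDP transition: returns (next_state, reward, done)."""
    x, y, mask = state
    dx, dy = MOVES[action]
    x = min(max(x + dx, 0), SIZE - 1)  # clamp to the grid
    y = min(max(y + dy, 0), SIZE - 1)
    reward = -0.01                     # small cost per move (assumed)
    if (x, y) in SENSORS:
        bit = 1 << SENSORS.index((x, y))
        if not mask & bit:             # reward only the first visit
            mask |= bit
            reward += 1.0
    done = mask == (1 << len(SENSORS)) - 1
    return (x, y, mask), reward, done

def q_learning_update(Q, s, a, r, s2, done, alpha=0.1, gamma=0.9):
    """Value-based: bootstrap from the best next action's value."""
    target = r if done else r + gamma * max(Q[s2])
    Q[s][a] += alpha * (target - Q[s][a])

def softmax(prefs):
    """Numerically stable softmax over action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_update(V, theta, s, a, r, s2, done,
                        alpha_v=0.1, alpha_pi=0.1, gamma=0.9):
    """Policy-based: the critic's TD error scales the actor's gradient step."""
    delta = r + (0.0 if done else gamma * V[s2]) - V[s]  # TD error
    V[s] += alpha_v * delta                              # critic update
    pi = softmax(theta[s])
    for b in range(len(MOVES)):                          # actor update
        grad = (1.0 if b == a else 0.0) - pi[b]          # d log pi(a|s) / d theta[s][b]
        theta[s][b] += alpha_pi * delta * grad
    return delta

# Apply both updates to the same transition.
Q = defaultdict(lambda: [0.0] * len(MOVES))
V = defaultdict(float)
theta = defaultdict(lambda: [0.0] * len(MOVES))
s = reset()
s2, r, done = step(s, 0)  # move up
q_learning_update(Q, s, 0, r, s2, done)
actor_critic_update(V, theta, s, 0, r, s2, done)
```

In this sketch the difference is visible in what each method stores: Q-learning keeps one value per state-action pair and acts greedily on it, while the actor-critic keeps a state value (the critic) plus separate action preferences (the actor) that are nudged in the direction of the TD error.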

Published

15-12-2024

Section

Articles

How to Cite

Xu, W., Zhang, M., & Wang, H. (2024). Exploration of Teaching Reform in Reinforcement Learning Courses Based on Model Framework Programming. Journal of Education and Educational Research, 11(3), 118-120. https://doi.org/10.54097/q1h8v934