Multi-Armed Bandit Algorithms: Analysis and Applications Across Domains

Authors

  • Qinchuan Zhang

DOI:

https://doi.org/10.54097/apzhv358

Keywords:

Multi-Armed Bandit Algorithms; Decision-Making; Uncertainty.

Abstract

This study provides an in-depth exploration of the pivotal role of Multi-Armed Bandit (MAB) Algorithms in decision-making across diverse sectors, focusing on their theoretical foundations, real-world applications, and empirical evidence. MAB Algorithms, metaphorically representing choices among various slot machine arms with different rewards, are crucial for optimizing decisions in uncertain settings by striking a balance between exploration and exploitation. The study examines four principal algorithms—Greedy, Epsilon-Greedy, Upper Confidence Bound, and Thompson Sampling—each tailored to specific types of decision-making scenarios. Their applications are extensive, particularly in fields like recommendation systems, financial strategy formulation, and network security, where they enable adaptive learning and strategic optimization. In the context of the 5G era, MAB Algorithms are instrumental in managing wireless network resources effectively amid dynamic conditions. This is further exemplified through empirical studies, such as research on decision-making under uncertainty, which demonstrate the algorithms' effectiveness in guiding choices in experimental setups. The paper highlights the growing importance and sophistication of MAB Algorithms, emphasizing their significant role in advancing human decision-making capabilities.
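The exploration-exploitation balance described above can be sketched with a minimal Epsilon-Greedy simulation on Bernoulli-reward arms. This is a generic illustration of the algorithm class the abstract names, not the paper's experimental setup; the arm success probabilities and parameter values are invented for the example:

```python
import random

def epsilon_greedy(true_probs, steps=10000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (hypothetical probabilities).

    With probability epsilon we explore a random arm; otherwise we exploit
    the arm with the best empirical mean reward so far.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms      # number of pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                      # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        # incremental update of the empirical mean
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts

values, counts = epsilon_greedy([0.2, 0.5, 0.8])
```

Over many rounds the empirical means converge toward the true arm probabilities, and most pulls concentrate on the best arm; the other algorithms surveyed (Upper Confidence Bound, Thompson Sampling) replace the fixed epsilon with adaptive exploration schedules.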


References

Bouneffouf, D., & Rish, I. (2019). A survey on practical applications of multi-armed and contextual bandits.

Bouneffouf, D., Rish, I., & Aggarwal, C. (2020, July). Survey on applications of multi-armed and contextual bandits. In 2020 IEEE Congress on Evolutionary Computation (CEC) (pp. 1-8). IEEE.

Silva, N., Werneck, H., Silva, T., Pereira, A. C., & Rocha, L. (2022). Multi-armed bandits in recommendation systems: A survey of the state-of-the-art and future directions. Expert Systems with Applications, 197, 116669.

Agrawal, S., Tiwari, A., Naik, P., & Srivastava, A. (2021). Improved differential evolution based on multi-armed bandit for multimodal optimization problems. Applied Intelligence, 1-22.

Neto, W. L., Li, Y., Gaillardon, P. E., & Yu, C. (2022). End-to-end automatic logic optimization exploration via domain-specific multi-armed bandit. arXiv preprint arXiv:2202.07721.

Kreutzer, J., Vilar, D., & Sokolov, A. (2021). Bandits don't follow rules: Balancing multi-facet machine translation with multi-armed bandits. arXiv preprint arXiv:2110.06997.

Chen, Y., Cuellar, A., Luo, H., Modi, J., Nemlekar, H., & Nikolaidis, S. (2020, August). Fair contextual multi-armed bandits: Theory and experiments. In Conference on Uncertainty in Artificial Intelligence (pp. 181-190). PMLR.

Guo, H., Pasunuru, R., & Bansal, M. (2020, April). Multi-source domain adaptation for text classification via distancenet-bandits. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 05, pp. 7830-7838).

Zhu, X., Xu, H., Zhao, Z., et al. (2021). An environmental intrusion detection technology based on WiFi. Wireless Personal Communications, 119(2), 1425-1436.

Balakrishnan, A., Bouneffouf, D., Mattei, N., & Rossi, F. (2019). Using multi-armed bandits to learn ethical priorities for online AI systems. IBM Journal of Research and Development, 63(4/5), 1-1.

Published

26-04-2024

How to Cite

Zhang, Q. (2024). Multi-Armed Bandit Algorithms: Analysis and Applications Across Domains. Highlights in Science, Engineering and Technology, 94, 170-174. https://doi.org/10.54097/apzhv358