Comparative Evaluation, Challenges, and Diverse Applications of Multi-Armed Bandit Algorithms

Yunmeng Han

doi:10.54097/jdcjkj94

Keywords:

Multi-Armed Bandits, Application Scenarios, Real-Time Data Processing.

Abstract

This paper presents a comparative study of key Multi-Armed Bandit (MAB) algorithms, examining their challenges, potential remedies, and applications. It concentrates on three principal algorithms: Upper Confidence Bound (UCB), Thompson Sampling, and ε-greedy. The analysis assesses their performance in terms of convergence rate, precision, and computational cost. It identifies key challenges in non-stationary environments and large-scale deployments, including the complexities of multi-objective optimization, and proposes solutions such as adaptive algorithms and integration with parallel computing frameworks. The paper then surveys application domains ranging from online advertising and recommendation systems to clinical trial design, comparing traditional and novel applications. The discussion also covers data scarcity, cold-start problems, ethical considerations in algorithm design, and the difficulties of processing real-time data. The concluding sections review recent successful deployments of MAB algorithms, identify the core factors behind their effectiveness, and outline likely directions for future development. Taken together, the analysis gives an overview of the MAB field and its practical impact across sectors.
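For readers unfamiliar with the three algorithms named in the abstract, the following is a minimal illustrative sketch, not the paper's own implementation. It assumes Bernoulli (0/1) rewards, the standard UCB1 exploration bonus, and a Beta(1, 1) prior for Thompson Sampling; the function names, the `epsilon` default, and the `run_bandit` harness are hypothetical choices made for this example.

```python
import math
import random


def epsilon_greedy(counts, sums, rng, epsilon=0.1):
    """With probability epsilon pick a random arm, else the best empirical mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    # Unpulled arms get +inf so each arm is tried at least once when exploiting.
    means = [s / c if c else float("inf") for s, c in zip(sums, counts)]
    return max(range(len(counts)), key=means.__getitem__)


def ucb1(counts, sums, rng):
    """UCB1: empirical mean plus sqrt(2 ln t / n) exploration bonus."""
    for arm, c in enumerate(counts):
        if c == 0:
            return arm  # pull every arm once before using the bound
    t = sum(counts)
    scores = [s / c + math.sqrt(2 * math.log(t) / c) for s, c in zip(sums, counts)]
    return max(range(len(counts)), key=scores.__getitem__)


def thompson(counts, sums, rng):
    """Thompson Sampling: sample each arm's Beta posterior, pick the largest draw."""
    draws = [rng.betavariate(1 + s, 1 + (c - s)) for s, c in zip(sums, counts)]
    return max(range(len(counts)), key=draws.__getitem__)


def run_bandit(policy, probs, horizon=2000, seed=0):
    """Simulate a Bernoulli bandit (assumed setting) and return (pull counts, expected regret)."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    sums = [0] * len(probs)  # total reward (number of successes) per arm
    for _ in range(horizon):
        arm = policy(counts, sums, rng)
        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        sums[arm] += reward
    best = max(probs)
    regret = sum(c * (best - p) for c, p in zip(counts, probs))
    return counts, regret


if __name__ == "__main__":
    for policy in (epsilon_greedy, ucb1, thompson):
        counts, regret = run_bandit(policy, [0.2, 0.5, 0.8])
        print(policy.__name__, counts, round(regret, 1))
```

Running the script prints, for each policy, how often each arm was pulled and the resulting expected regret on one simulated run; the arm probabilities `[0.2, 0.5, 0.8]` are arbitrary illustrative values, and a single seed is not a basis for ranking the algorithms.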
