This paper examines the Multi-Armed Bandit (MAB) problem, a fundamental concept in reinforcement learning and probability theory, with a focus on its applications in recommendation systems and dynamic decision-making settings such as pricing and investment. It begins by outlining the central dilemma of the MAB problem: balancing exploration and exploitation under a limited number of trials. The study concentrates on Upper Confidence Bound (UCB) policies, in particular UCB-tuned and Asymptotically Optimal UCB, which are noted for striking an effective balance between exploration and exploitation. The main contribution of this research is the enhancement of these UCB policies via a weighted average method, yielding the WA-UCB-tuned and WA Asymptotically Optimal UCB algorithms. These enhanced variants are rigorously compared with the traditional UCB1, UCB-tuned, and Asymptotically Optimal UCB policies across MAB models with varying numbers of arms. The paper provides a thorough introduction to the MAB problem and the relevant UCB policies, describes the methodology behind the weighted average optimization, and presents extensive experimental analysis and evaluation of the findings. The results show marked improvements in algorithmic performance, suggesting meaningful advances for recommendation systems and other applications of the MAB problem.
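For context on the baseline policies compared here, the standard UCB1 index for arm $i$ at round $t$ is $\bar{x}_i + \sqrt{2 \ln t / n_i}$, where $\bar{x}_i$ is the empirical mean reward and $n_i$ the number of pulls of that arm. The sketch below illustrates this baseline on simulated Bernoulli arms; the function names (`ucb1_select`, `run_bandit`) and the simulation setup are illustrative assumptions, and the paper's weighted-average (WA) variants are not reproduced here since their details are given later in the text.

```python
import math
import random

def ucb1_select(counts, means, t):
    """Pick the arm with the largest UCB1 index: mean + sqrt(2*ln(t)/n_i).
    Arms that have never been pulled are tried first."""
    for i, n in enumerate(counts):
        if n == 0:
            return i
    indices = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(indices)), key=lambda i: indices[i])

def run_bandit(arm_probs, horizon=10_000, seed=0):
    """Simulate Bernoulli-reward arms and run UCB1 for `horizon` rounds."""
    rng = random.Random(seed)
    k = len(arm_probs)
    counts = [0] * k    # pulls per arm (n_i)
    means = [0.0] * k   # empirical mean reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, means, t)
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean update
        total += reward
    return total, counts

if __name__ == "__main__":
    total_reward, pulls = run_bandit([0.2, 0.5, 0.7])
    print("total reward:", total_reward, "pulls per arm:", pulls)
```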