Maximal expectation as upper confidence bound for multi-armed bandit problems

Kao, Kuo-Yuan; Chen, I-Hao

doi:10.1109/itaic.2014.7065060

Cited by 1 publication

(1 citation statement)

References 9 publications

“…For example, Chen proposed the UCB-max algorithm, Roy referred to the UCB-KL algorithm and their introduction of the TV-KL-UCB algorithm, and Gil. proposed the UCB-RAD auxiliary algorithm, among others [4][5][6]. Considering the potential strong fluctuations in video rating data during execution, this paper also adopts the AUCB (Asymptotically Optimal UCB) algorithm to more accurately estimate the uncertainty of actions.…”

Section: Upper Confidence Bound (UCB)

Citation type: mentioning

confidence: 99%

Optimizing video click-through rates with bandit algorithms

Liu

2024

ACE

In recent years, videos have increasingly influenced public perception, making video platforms a focal point of digital consumption. One critical challenge for platform operators is identifying videos that resonate most with users, as user ratings directly reflect viewer preferences and experiences. This study explores the use of bandit algorithms to predict and strategize the overall ratings of various anime videos on the Bilibili platform. Bandit algorithms, which build on the multi-armed bandit model, dynamically adjust selection strategies based on prior feedback to maximize cumulative rewards. Our empirical research assessed multiple bandit algorithms, including the ε-greedy, Upper Confidence Bound (UCB), Explore-then-Commit (ETC), and Thompson Sampling (TS) algorithms. The findings indicate that the Thompson Sampling algorithm, in particular, achieved the lowest cumulative regret in selecting optimal videos on the Bilibili platform, showcasing its superior performance. This study highlights the potential of bandit algorithms to enhance video selection processes, ensuring that platforms can effectively cater to user preferences and enhance viewer satisfaction.
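To make the comparison in the abstract concrete, the following is a minimal sketch of how cumulative regret can be measured for two of the named algorithm families. It is not the code or data used in either paper. It assumes Bernoulli rewards with made-up arm means, uses the textbook UCB1 index (not the maximal-expectation bound of Kao and Chen), and all names (`run_bandit`, `ucb1`, `thompson`, `uniform`) are hypothetical.

```python
import math
import random


def run_bandit(policy, means, horizon, seed=0):
    """Simulate a Bernoulli bandit and return the cumulative pseudo-regret.

    `means` holds the (assumed, illustrative) success probability of each arm.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k   # times each arm has been played
    wins = [0] * k    # number of reward-1 outcomes per arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        arm = policy(t, pulls, wins, rng)
        reward = 1 if rng.random() < means[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        # Pseudo-regret: expected loss relative to always playing the best arm.
        regret += best - means[arm]
    return regret


def ucb1(t, pulls, wins, rng):
    """Textbook UCB1: empirical mean plus sqrt(2 ln t / n) exploration bonus."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm  # play every arm once before using the index
    return max(
        range(len(pulls)),
        key=lambda a: wins[a] / pulls[a] + math.sqrt(2 * math.log(t) / pulls[a]),
    )


def thompson(t, pulls, wins, rng):
    """Thompson Sampling with a Beta(1, 1) prior on each arm's mean."""
    samples = [
        rng.betavariate(wins[a] + 1, pulls[a] - wins[a] + 1)
        for a in range(len(pulls))
    ]
    return max(range(len(pulls)), key=samples.__getitem__)


def uniform(t, pulls, wins, rng):
    """Baseline that ignores feedback and picks an arm uniformly at random."""
    return rng.randrange(len(pulls))


if __name__ == "__main__":
    arm_means = [0.2, 0.5, 0.8]  # illustrative values, not Bilibili data
    for name, policy in [("uniform", uniform), ("UCB1", ucb1), ("Thompson", thompson)]:
        print(name, round(run_bandit(policy, arm_means, 2000, seed=1), 1))
```

Regret here is computed from the true arm means, which is only possible in simulation; on real rating data the best arm is unknown and regret has to be estimated against a chosen reference.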
