2024
DOI: 10.1109/tnnls.2022.3190509
Multiarmed Bandit Algorithms on Zynq System-on-Chip: Go Frequentist or Bayesian?

Cited by 5 publications (2 citation statements)
References 18 publications
“…During the experiment, the ε-Greedy algorithm simplifies the trade-off between exploration and exploitation through a fixed proportion of random exploration but may lack flexibility in dynamic environments [13]. The UCB algorithm [14] balances exploration and exploitation by calculating confidence upper bounds and often achieves better performance, especially when the reward distributions are relatively stable.…”
Section: Discussion
confidence: 99%
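The statement above contrasts ε-Greedy's fixed random-exploration rate with UCB's confidence-bound approach. A minimal Python sketch of the ε-Greedy selection rule (illustrative only; the function name and parameters are assumptions, not the cited paper's hardware implementation):

```python
import random

def epsilon_greedy_select(values, epsilon=0.1):
    """epsilon-Greedy arm selection: with probability epsilon pick a
    random arm (explore); otherwise pick the arm with the highest
    estimated mean reward (exploit). `values` holds per-arm estimates."""
    if random.random() < epsilon:
        return random.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])
```

Because ε stays fixed, the exploration rate does not adapt as estimates sharpen, which is the inflexibility in dynamic environments that the statement notes.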
“…The UCB algorithm, introduced by Auer et al (2002), is a widely recognized strategy for balancing the exploitation-exploration trade-off in the multiarmed bandit problem. It prioritizes arms with high average rewards and high uncertainty, allowing for efficient exploration of the action space [5].…”
Section: Definition and Development
confidence: 99%
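The UCB1 index described here (Auer et al., 2002) scores each arm by its empirical mean plus an uncertainty bonus that shrinks as the arm is played more. A minimal sketch of the selection step, assuming per-arm play counts and mean-reward estimates (names are illustrative):

```python
import math

def ucb1_select(counts, means, t):
    """UCB1 arm selection (Auer et al., 2002): after playing every arm
    once, pick the arm maximizing mean + sqrt(2 * ln(t) / n_arm), where
    t is the total number of plays so far. Rarely played arms get a
    large bonus, so high uncertainty is explored efficiently."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # initialization: play each arm once
    ucb = [means[a] + math.sqrt(2.0 * math.log(t) / counts[a])
           for a in range(len(counts))]
    return max(range(len(ucb)), key=lambda a: ucb[a])
```

The bonus term favors arms with high average reward *or* few plays, which is exactly the "high average rewards and high uncertainty" prioritization the statement describes.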