2021
DOI: 10.1609/aaai.v35i1.16100

Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search

Abstract: Monte Carlo tree search (MCTS) has achieved state-of-the-art results in many domains, such as Go and Atari games, when combined with deep neural networks (DNNs). When more simulations are executed, MCTS can achieve higher performance, but it also requires enormous amounts of CPU and GPU resources. However, not all states require a long search to identify the best action that the agent can find. For example, in 19x19 Go and NoGo, we found that for more than half of the states, the best action predicted by DN…

Citations: cited by 1 publication (1 citation statement)
References: 27 publications
“…In order to solve this issue, we employ a deep normalizing flow (DNF) to construct Q_θ(H_t | Φ_t). Compared with other generative models, e.g., variational auto-encoders (VAEs) and generative adversarial networks (GANs), DNF is a fully probabilistic model with tractable exact density inference, which can accelerate the search efficiency of Monte-Carlo tree search [29]. Although in this paper the exact density of the predicted channel states is not utilized, it can help reduce the number of simulations in our future work by evaluating the certainty of current solutions.…”
Section: A Channel Tracking With Historical Incomplete Observations
Citation type: mentioning
confidence: 99%