2023
DOI: 10.1007/978-981-99-0617-8_11

Mastering “Gongzhu” with Self-play Deep Reinforcement Learning

Cited by 1 publication (1 citation statement)
References 12 publications
“…The second sub-system uses the personality extracted by the previous component to influence the final decision for the next action to perform. At the start of the process, KataGo (Wu, 2020) provides a raw policy which consists in non-zero probabilities for all legal moves. As such, an optimal player would greedily select the action with the highest probability for each turn.…”
Section: Playing Go Against Microbes (mentioning)
confidence: 99%