2022
DOI: 10.1152/jn.00229.2021

Clustering analysis of movement kinematics in reinforcement learning

Abstract: Reinforcement learning has been used as an experimental model of motor skill acquisition, where at times movements are successful and thus reinforced. One fundamental problem is to understand how humans select exploration over exploitation during learning. The decision could be influenced by factors such as task demands and reward availability. In this study, we applied a clustering algorithm to examine how a change in the accuracy requirements of a task affected the choice of exploration over exploitation. Pa…

citations
Cited by 13 publications
(16 citation statements)
references
References 38 publications
“…Those explored coordination patterns are strongly anchored for a period of time (i.e., with the highest probability to transit towards them), but still are not the final destination of the learner, which fits the definition of a local strategy 1. This exploration dynamics, as an information gathering process is more than merely a global trial to trial variability of the behavior during practice as proposed previously 19, 30. Indeed, a global ratio between the transition of coordination patterns between trials and the repetition of similar coordination patterns between trial (e.g., the exploration/exploitation ratio 19, 30) does not account for this exploratory dynamics during learning as it does not account for the order or sequencing of the transitions.…”
Section: Discussion (mentioning)
confidence: 95%
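
The point made in the statement above, that a global exploration/exploitation ratio ignores the order of transitions, can be illustrated with a minimal sketch. It assumes each trial has already been assigned a coordination-pattern label (for example by some clustering step); the function names, labels, and sequences below are hypothetical and are not taken from the cited papers.

```python
from collections import Counter


def exploration_ratio(labels):
    """Fraction of consecutive trial pairs in which the pattern label changes.

    A change counts as exploration and a repetition as exploitation.
    This is an assumed, simplified definition used for illustration only.
    """
    pairs = list(zip(labels, labels[1:]))
    if not pairs:
        raise ValueError("need at least two trials")
    switches = sum(a != b for a, b in pairs)
    return switches / len(pairs)


def transition_counts(labels):
    """Count each ordered (from, to) transition between consecutive trials."""
    return Counter(zip(labels, labels[1:]))


# Hypothetical learner 1: alternates between A and B, then settles on C.
seq_one = ["A", "B", "A", "B", "C", "C", "C"]
# Hypothetical learner 2: starts on C, then leaves it to alternate A and B.
seq_two = ["C", "C", "C", "A", "B", "A", "B"]

# Both sequences contain 4 switches out of 6 consecutive pairs,
# so the global ratio cannot tell them apart.
print(exploration_ratio(seq_one), exploration_ratio(seq_two))

# The transition counts differ: learner 1 moves B -> C,
# while learner 2 moves C -> A.
print(transition_counts(seq_one))
print(transition_counts(seq_two))
```

Under these assumptions, the two learners share the same global ratio although one converges on pattern C and the other abandons it, which is the kind of sequencing information the statement says the ratio leaves out.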
“…In fact, even if some patterns are only little explored, it seems beneficial for the learner to still be able to explore them as they may rapidly gather information from their exploration. The initial flexibility of a learner could therefore form the basis for an effective exploration during learning, as recently highlighted by Sidarta et al 25. Narrowing the region of exploration by limiting the potential patterns that can be visited during learning can impact the individual dynamics of the performance because the individually necessary information therefore cannot be gathered through an effective exploration 38.…”
Section: Discussion (mentioning)
confidence: 97%
“…Those explored coordination patterns are strongly anchored for a period of time (i.e., with the highest probability to transit towards them), but still are not the final destination of the learner, which fits the definition of a local strategy 1. This exploration dynamics, as an information gathering process is more than merely a global trial to trial variability of the behavior during practice as proposed previously 19, 25. Indeed, a global ratio between the transition of coordination patterns between trials and the repetition of similar coordination patterns between trial (e.g., the exploration/exploitation ratio 19, 25) does not account for this exploratory dynamics during learning as it does not account for the order or sequencing of the transitions.…”
Section: Discussion (mentioning)
confidence: 98%
“…This result suggests that more structured variations could more effectively guide learning. Moreover, recent results demonstrated that practice conditions leading to excessive exploration of movement solutions can be detrimental to learning (Sidarta et al, 2022). In fact, newly discovered behavioral solutions require exploitation during practice to stabilize them in the learners' repertoire (Hossner et al, 2016; Komar et al, 2019).…”
Section: Induced Variable Practice In Learning Protocol (mentioning)
confidence: 99%