2020
DOI: 10.48550/arxiv.2006.16498
Preprint

Accelerating Reinforcement Learning Agent with EEG-based Implicit Human Feedback

Abstract: Providing Reinforcement Learning (RL) agents with human feedback can dramatically improve various aspects of learning. However, previous methods require the human observer to give inputs explicitly (e.g., by pressing buttons or using a voice interface), burdening the human in the loop of the RL agent's learning process. Further, explicit human advice (feedback) is sometimes difficult or impossible to obtain, e.g., in autonomous driving or disabled rehabilitation. In this work, we investigate capturing the human's intrinsic reactio…
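The abstract is truncated before the mechanism, but the title and the citing papers indicate the implicit signal is an EEG error-related potential (ErrP) used to steer the agent. A minimal, hypothetical Python sketch of that idea follows; decode_errp, its sigmoid stand-in, and all parameter values are illustrative assumptions, not the paper's implementation. The decoded probability that the observer perceived the last action as an error is folded into the reward before a standard Q-learning backup.

import numpy as np

def decode_errp(eeg_window):
    # Stand-in for a trained ErrP classifier (a real system would classify
    # labeled EEG epochs); here we merely squash the window's mean amplitude
    # into a pseudo-probability that the observer registered an error.
    return 1.0 / (1.0 + np.exp(-float(np.mean(eeg_window))))

def shaped_q_update(Q, s, a, r_env, s_next, eeg_window,
                    alpha=0.1, gamma=0.99, penalty=1.0):
    # Fold the implicit feedback into the reward: the more confident the
    # decoder is that the human perceived an error, the larger the penalty.
    r = r_env - penalty * decode_errp(eeg_window)
    # Standard tabular Q-learning backup on the shaped reward.
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    return Q

The point of the sketch is that the human never issues a command: the shaping term is derived entirely from passively recorded EEG.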

Cited by 3 publications (3 citation statements)
References 50 publications

“…As such, our work is closer to [2], [37], [40]-[42], which used nonverbal signals to identify critical states during robot operation, detect robot errors, and adjust robot behavior. Other types of feedback signals, such as those from brain-computer interfaces, have been used in HRI [43]-[45]; however, they are impractical for navigation tasks. Simulation in HRI.…”
Section: Related Work
confidence: 99%
“…X2T differs in that it learns from naturally-occurring feedback, which requires no additional effort from the user to train the agent. Other prior work trains RL agents from implicit signals, such as electroencephalography (Xu et al., 2020), peripheral pulse measurements (McDuff & Kapoor, 2019), facial expressions (Jaques et al., 2017; Cui et al., 2020), and clicks in web search (Radlinski & Joachims, 2006). X2T differs in that it trains an interface that always conditions on the user's input when selecting an action, rather than an autonomous agent that ignores user input after the training phase.…”
Section: Using the Reward Model To Select Actions
confidence: 99%
“…The agent takes advantage of the implicit brain signals acquired from the human user when determining the appropriate agent action. Thus, the human user does not need to explicitly send action commands, significantly reducing the burden on the human user [7,8].…”
Section: Introduction
confidence: 99%
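That last statement summarizes the interaction loop: the agent selects its own actions and treats the decoded brain signal as a silent veto rather than an explicit command. A small illustrative sketch under the same assumptions as above (the function name, the single runner-up fallback, and the 0.5 threshold are all hypothetical choices, not the cited method):

import numpy as np

def select_with_implicit_feedback(q_values, p_error, threshold=0.5):
    # q_values: the agent's action-value estimates for the current state.
    # p_error:  an EEG decoder's estimate that the user's brain registered
    #           the agent's proposed (greedy) action as an error.
    ranked = np.argsort(q_values)[::-1]    # candidate actions, best first
    if p_error > threshold and len(ranked) > 1:
        return int(ranked[1])              # veto the proposal, take runner-up
    return int(ranked[0])

# e.g., select_with_implicit_feedback(np.array([0.2, 0.9, 0.5]), p_error=0.8) -> 2

Because the veto is inferred from passive EEG, the user's burden stays low, matching the motivation in the quoted introduction [7, 8].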