2020
DOI: 10.1109/tro.2020.2964147

Using Human Ratings for Feedback Control: A Supervised Learning Approach With Application to Rehabilitation Robotics

Abstract: This paper presents a method for tailoring a parametric controller based on human ratings. The method leverages supervised learning concepts in order to train a reward model from data. It is applied to a gait rehabilitation robot with the goal of teaching the robot how to walk patients physiologically. In this context, the reward model judges the physiology of the gait cycle (instead of therapists) using sensor measurements provided by the robot and the automatic feedback controller chooses the input settings …

Cited by 11 publications (7 citation statements)
References 40 publications

“…Further related, but conceptually different, approaches are to learn policies rather than an objective function [38], [39], which is often referred to as imitation learning, or to use labeled data in order to learn an objective function, e.g., using supervised learning [40]-[42]. Notably, [40] uses semisupervised learning with a similar motivation, where drivers are classified into aggressive and normal driving styles based on a few labeled data points.…”
Section: Imitation Learning and Supervised Learning (mentioning)
confidence: 99%
“…Therefore, we are more constrained in the solution but need less data and have properties that are invariant through the learning process, which we can use to enforce safe behaviors while learning. Compared to supervised learning methods [40]-[42], we do not require labeled data in order to learn a control objective. On the other hand, inverse learning methods that use unlabeled data, such as IOC, IRL, and our method, require the assumption that the data represent desirable behavior.…”
Section: Imitation Learning and Supervised Learning (mentioning)
confidence: 99%
“…2. The estimates $P_{i,t}$, $q_{i,t}$ and $r_{i,t}$ of the unknown parameters $P_i$, $q_i$ and $r_i$ of $U_i$ are updated by means of an ad-hoc learning procedure (7). Such a procedure relies on a recursive least square scheme which makes use only of the most updated data $(y_{i,t}, x_{i,t})$, thus not requiring to store and use all the past points generated by the distributed algorithm.…”
Section: Distributed Algorithm Description (mentioning)
confidence: 99%
“…The aim of the learning part of Algorithm 1 (cf. (7)) is to provide a recursive scheme to let each agent $i$ estimate the unknown parameters of $U_i$. Specifically, the considered scheme aims at solving, for each $t$, the least squares (LS) problem minimize…”
Section: Parameters Estimation Via Recursive Least Squares (mentioning)
confidence: 99%
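
The recursive scheme described in the two statements above can be illustrated with a textbook recursive least squares (RLS) update, which refines the estimate from the newest sample only and never stores past data. This is a minimal sketch, not the citing paper's procedure (7), whose exact form is not reproduced here. The class `RecursiveLeastSquares`, the helper `quadratic_features`, the initial covariance scale `delta`, and the assumption that the unknown function is a scalar quadratic are all hypothetical and chosen for illustration.

```python
import numpy as np


def quadratic_features(x):
    """Hypothetical feature map for a scalar quadratic p*x^2 + q*x + r."""
    return np.array([x * x, x, 1.0])


class RecursiveLeastSquares:
    """Textbook RLS estimator (no forgetting factor); an illustrative assumption."""

    def __init__(self, n_params, delta=1e6):
        # Parameter estimate and its (scaled) covariance; a large delta
        # means the initial guess of zero is given very little weight.
        self.theta = np.zeros(n_params)
        self.cov = delta * np.eye(n_params)

    def update(self, phi, y):
        """Incorporate one sample (phi, y) without touching past data."""
        phi = np.asarray(phi, dtype=float)
        cov_phi = self.cov @ phi
        gain = cov_phi / (1.0 + phi @ cov_phi)
        # Correct the estimate by the gain times the prediction error.
        self.theta = self.theta + gain * (y - phi @ self.theta)
        # Rank-one covariance downdate: cov <- cov - gain * (phi^T cov).
        self.cov = self.cov - np.outer(gain, cov_phi)
        return self.theta
```

Feeding samples one at a time, for example `est.update(quadratic_features(x), y)`, keeps memory constant in the number of samples, which is the property the quoted statement highlights.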