Reinforcement learning (RL) has achieved tremendous success as a general framework for learning how to make decisions. However, this success relies on the interactive hand-tuning of a reward function by RL experts. On the other hand, inverse reinforcement learning (IRL) seeks to learn a reward function from readily obtained human demonstrations. Yet, IRL suffers from two major limitations: 1) reward ambiguity: there are infinitely many reward functions that could explain an expert's demonstration, and 2) heterogeneity: human experts adopt varying strategies and preferences, which makes learning from multiple demonstrators difficult under the common assumption that all demonstrators seek to maximize the same reward. In this work, we propose a method to jointly infer a task goal and humans' strategic preferences via network distillation. This approach enables us to distill a robust task reward (addressing reward ambiguity) and to model each strategy's objective (handling heterogeneity). We demonstrate that our algorithm better recovers the task reward and strategy rewards, and better imitates the strategies, in two simulated tasks and a real-world table tennis task.
CCS Concepts: • Theory of computation → Inverse reinforcement learning; • Computing methodologies → Learning from demonstrations; Neural networks.
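The sketch below illustrates one way the reward-distillation idea described in this abstract can be set up; it is not the authors' implementation. It assumes each demonstrator's reward is modeled as a shared task reward plus a strategy-specific reward, with the strategy rewards regularized toward zero so that structure common to all strategies is distilled into the task reward. The network sizes, the `irl_loss_fn` stand-in, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of reward-network distillation:
# demonstrator i's reward = shared task reward + strategy-specific reward,
# with strategy rewards regularized toward zero so common reward mass
# migrates into the shared task reward.
import torch
import torch.nn as nn


class RewardNet(nn.Module):
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s):
        return self.net(s).squeeze(-1)


state_dim, n_strategies = 8, 3                       # illustrative sizes
task_reward = RewardNet(state_dim)                    # shared across demonstrators
strategy_rewards = nn.ModuleList([RewardNet(state_dim) for _ in range(n_strategies)])
opt = torch.optim.Adam(
    list(task_reward.parameters()) + list(strategy_rewards.parameters()), lr=1e-3
)


def distillation_loss(states_per_strategy, irl_loss_fn, reg_weight=0.01):
    """states_per_strategy[i]: batch of states from demonstrator i's trajectories.
    irl_loss_fn(reward, states): any IRL objective evaluated with the combined
    reward; here it is only a stand-in for a real (e.g., adversarial) IRL loss."""
    total = 0.0
    for i, states in enumerate(states_per_strategy):
        combined = lambda s, i=i: task_reward(s) + strategy_rewards[i](s)
        total = total + irl_loss_fn(combined, states)
        # Push strategy-specific rewards toward zero (the distillation step).
        total = total + reg_weight * strategy_rewards[i](states).pow(2).mean()
    return total


# Toy usage with a placeholder objective; a real system would plug in an IRL loss.
toy_states = [torch.randn(16, state_dim) for _ in range(n_strategies)]
toy_irl_loss = lambda reward, s: -reward(s).mean()
loss = distillation_loss(toy_states, toy_irl_loss)
opt.zero_grad(); loss.backward(); opt.step()
```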
As robots become more prevalent, the importance of the field of human-robot interaction (HRI) grows accordingly. As such, we should endeavor to employ the best statistical practices in HRI research. Likert scales are commonly used metrics in HRI to measure perceptions and attitudes. Due to misinformation or honest mistakes, many HRI researchers do not adopt best practices when analyzing Likert data. We conduct a review of psychometric literature to determine the current standard for Likert scale design and analysis. Next, we conduct a survey of five years of the International Conference on Human-Robot Interaction (2016 through 2020) and report on incorrect statistical practices and design of Likert scales [1, 2, 3, 5, 7]. During these years, only 4 of the 144 papers applied proper statistical testing to correctly designed Likert scales. We additionally conduct a survey of best practices across several venues and provide a comparative analysis to determine how Likert practices differ across the field of human-robot interaction. We find that a venue's impact score correlates negatively with both the number of Likert-related errors and the acceptance rate, and that the total number of papers accepted per venue correlates positively with the number of errors. We also find statistically significant differences between venues in the frequency of misnomer and design errors. Our analysis suggests there are areas for meaningful improvement in the design and testing of Likert scales. Based on our findings, we provide guidelines and a tutorial for researchers on developing and analyzing Likert scales and associated data. We also detail a list of recommendations to improve the accuracy of conclusions drawn from Likert data.
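As a hedged illustration of the kind of analysis choice at stake (this is not a reproduction of the paper's tutorial), one commonly recommended practice is to treat responses to a single Likert item as ordinal and compare groups with a non-parametric test, while reserving parametric tests for properly constructed multi-item scales. The data, group names, and thresholds below are hypothetical.

```python
# Hedged illustration (not the paper's tutorial): a single Likert item is ordinal,
# so groups are often compared with a non-parametric test rather than a t-test.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
# Hypothetical 5-point responses to one Likert item from two study conditions.
condition_a = rng.integers(1, 6, size=30)
condition_b = rng.integers(2, 6, size=30)

stat, p = mannwhitneyu(condition_a, condition_b, alternative="two-sided")
print(f"Mann-Whitney U = {stat:.1f}, p = {p:.3f}")

# A multi-item Likert *scale* (the mean or sum of several related items) is often
# analyzed with parametric tests, provided its reliability has been checked
# (e.g., via Cronbach's alpha) and the tests' assumptions hold.
items = rng.integers(1, 6, size=(30, 4))   # hypothetical 4-item scale per participant
scale_scores = items.mean(axis=1)
```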