Echoing the results of our primary investigation with the full survey, here we present results from our reduced xAI survey. Our independent variable is the explainability method, our dependent variable is the explainability score, and we include the participant's baseline explainability score as a covariate. An ANCOVA showed that certain conditions in our experiment were rated as significantly more explainable than others (F(7, 277) = 3.14, p = 0.003). A Shapiro-Wilk test revealed that our data were not normally distributed, but we proceed with an ANCOVA given the lack of a non-parametric alternative and the robustness of the F-test (Cochran, 1947; Glass, Peckham, & Sanders, 1972; Hack, 1958; Pearson, 1931). A Tukey's HSD post-hoc analysis reveals that Counterfactual was rated as more explainable than Probability Scores (p = 0.002), as shown in Figure A1. The reduced questionnaire, after factor analysis and verification, is given in Table A1.
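The analysis pipeline described above (ANCOVA with a baseline covariate, a Shapiro-Wilk normality check on the residuals, and a Tukey HSD post-hoc comparison) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's dataset: the group sizes, condition labels, and effect sizes are invented, and the ANCOVA is computed directly as a nested-model F-test rather than through a statistics package.

```python
import numpy as np
from scipy import stats

# Hypothetical data: explainability scores under 3 conditions, plus a
# per-participant baseline covariate (all values are illustrative).
rng = np.random.default_rng(0)
n = 30
conditions = np.repeat([0, 1, 2], n)          # e.g., 0=Counterfactual, 1=Probability Scores, 2=Other
baseline = rng.normal(4.0, 1.0, size=3 * n)
effects = np.array([1.0, 0.0, 0.4])[conditions]
scores = 0.5 * baseline + effects + rng.normal(0.0, 1.0, size=3 * n)

def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2), beta

# ANCOVA via model comparison: the full model has condition dummies plus
# the covariate; the reduced model has the covariate only. The F-test on
# the difference in fit is the ANCOVA test for the condition factor.
intercept = np.ones_like(scores)
dummies = np.eye(3)[conditions][:, 1:]        # drop one level to avoid collinearity
X_full = np.column_stack([intercept, dummies, baseline])
X_red = np.column_stack([intercept, baseline])

rss_f, beta_full = rss(X_full, scores)
rss_r, _ = rss(X_red, scores)
df_num = X_full.shape[1] - X_red.shape[1]     # number of condition dummies
df_den = len(scores) - X_full.shape[1]
F = ((rss_r - rss_f) / df_num) / (rss_f / df_den)
p = stats.f.sf(F, df_num, df_den)

# Shapiro-Wilk normality check on the full-model residuals.
W, p_shapiro = stats.shapiro(scores - X_full @ beta_full)

# Tukey's HSD post-hoc pairwise comparison on the raw group scores.
groups = [scores[conditions == k] for k in range(3)]
tukey = stats.tukey_hsd(*groups)
```

The nested-model formulation makes the degrees of freedom explicit: the numerator counts the condition dummies being tested, and the denominator counts observations minus all fitted parameters, mirroring the F(7, 277) reported above for eight conditions.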
As high-speed, agile robots become more commonplace, these robots will have the potential to better aid and collaborate with humans. However, due to the increased agility and functionality of these robots, close collaboration with humans can create safety concerns that alter team dynamics and degrade task performance. In this work, we aim to enable the deployment of safe and trustworthy agile robots that operate in proximity with humans. We do so by (1) proposing a novel human-robot doubles table tennis scenario to serve as a testbed for studying agile, proximate human-robot collaboration and (2) conducting a user study to understand how attributes of the robot (e.g., robot competency or capacity to communicate) impact team dynamics, perceived safety, and perceived trust, and how these latent factors affect human-robot collaboration (HRC) performance. We find that robot competency significantly increases perceived trust (𝑝 < .001), extending skill-to-trust assessments in prior studies to agile, proximate HRC. Interestingly, we also find that when the robot vocalizes its intention to perform a task, team performance (𝑝 = .037) and perceived safety of the system (𝑝 = .009) significantly decrease.
CCS CONCEPTS
• Human-centered computing → Empirical studies in collaborative and social computing; • Computer systems organization → Robotics.
As robots become more prevalent, the importance of the field of human-robot interaction (HRI) grows accordingly. As such, we should endeavor to employ the best statistical practices in HRI research. Likert scales are commonly used metrics in HRI to measure perceptions and attitudes. Due to misinformation or honest mistakes, many HRI researchers do not adopt best practices when analyzing Likert data. We conduct a review of psychometric literature to determine the current standard for Likert scale design and analysis. Next, we conduct a survey of five years of the International Conference on Human-Robot Interaction (HRIc) (2016 through 2020) and report on incorrect statistical practices and design of Likert scales [1, 2, 3, 5, 7]. During these years, only 4 of the 144 papers applied proper statistical testing to correctly designed Likert scales. We additionally conduct a survey of best practices across several venues and provide a comparative analysis to determine how Likert practices differ across the field of Human-Robot Interaction. We find that a venue's impact score negatively correlates with both the number of Likert-related errors and the acceptance rate, while the total number of papers accepted per venue positively correlates with the number of errors. We also find statistically significant differences between venues in the frequency of misnomer and design errors. Our analysis suggests there are areas for meaningful improvement in the design and testing of Likert scales. Based on our findings, we provide guidelines and a tutorial for researchers for developing and analyzing Likert scales and associated data. We also detail a list of recommendations to improve the accuracy of conclusions drawn from Likert data.
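Two of the best practices the psychometric literature prescribes for Likert data can be illustrated concretely: check the internal consistency of a multi-item scale before averaging its items into a composite score, and compare groups with a nonparametric test suited to ordinal data rather than a t-test on individual items. The sketch below uses synthetic 5-point responses; the group sizes, item counts, and values are invented for illustration, and Cronbach's alpha is computed from its standard definition.

```python
import numpy as np
from scipy import stats

# Hypothetical responses: two groups of 25 participants each answer a
# 5-item, 5-point Likert scale (values 1-5; all data are illustrative).
rng = np.random.default_rng(1)
group_a = rng.integers(1, 6, size=(25, 5))
group_b = rng.integers(2, 6, size=(25, 5))

def cronbach_alpha(items):
    """Internal consistency of a multi-item scale (standard formula)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_var / total_var)

# Verify the items hang together before treating them as one scale.
alpha = cronbach_alpha(np.vstack([group_a, group_b]))

# Average items into a composite score per participant, then compare the
# groups with Mann-Whitney U, a nonparametric test appropriate for
# ordinal Likert data (rather than a t-test on single items).
score_a = group_a.mean(axis=1)
score_b = group_b.mean(axis=1)
U, p = stats.mannwhitneyu(score_a, score_b, alternative="two-sided")
```

On uncorrelated random responses like these, alpha will be low, which is exactly the signal that the items should not be combined into a single composite; a validated scale would be expected to score substantially higher.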