The increasing maturity of artificial intelligence technologies such as Machine Learning algorithms, Natural Language Processing (NLP), Automatic Speech Recognition (ASR) and Natural Language Generation is changing the way users interact with technology. Specifically, as voice interactions become commonplace, it is important to understand how such systems are trained. This systematic review investigates how human data is collected for training conversational agents, with a specific interest in data sets obtained directly from human participation in real contexts of need and use. The review was guided by the PRISMA guidelines, and searches were conducted in Scopus, Web of Science and ProQuest, restricted to English-language publications from the last 15 years (2005-2020), with pre-defined criteria to obtain a detailed, holistic perspective of practices published until July 2020. Across both search iterations, a total of 22 papers were considered for this review. The main contributions of these papers reveal a common use of learning from demonstration/observation and crowdsourcing methods in system training and dataset cataloguing, alongside handwriting and sentence labelling and Wizard-of-Oz based studies.