In recent years, conversational agents (CAs) have become ubiquitous and are a presence in our daily routines. It seems that the technology has finally ripened to advance the use of CAs in various domains, including commercial, healthcare, educational, political, industrial, and personal domains. In this study, the main areas in which CAs are successful are described along with the main technologies that enable the creation of CAs. Capable of conducting ongoing communication with humans, CAs are encountered in natural-language processing, deep learning, and technologies that integrate emotional aspects. The technologies used for the evaluation of CAs and publicly available datasets are outlined. In addition, several areas for future research are identified to address moral and security issues, given the current state of CA-related technological developments. The uniqueness of our review is that an overview of the concepts and building blocks of CAs is provided, and CAs are categorized according to their abilities and main application domains. In addition, the primary tools and datasets that may be useful for the development and evaluation of CAs of different categories are described. Finally, some thoughts and directions for future research are provided, and domains that may benefit from conversational agents are introduced.
Children with special needs may struggle to identify uncomfortable and unsafe situations. In this study, we aimed at developing an automated system that can detect such situations based on audio and text cues to encourage children’s safety and prevent situations of violence toward them. We composed a text and audio database with over 1891 sentences extracted from videos presenting real-world situations, and categorized them into three classes: neutral sentences, insulting sentences, and sentences indicating unsafe conditions. We compared insulting and unsafe sentence-detection abilities of various machine-learning methods. In particular, we found that a deep neural network that accepts the text embedding vectors of bidirectional encoder representations from transformers (BERT) and audio embedding vectors of Wav2Vec as input attains the highest accuracy in detecting unsafe and insulting situations. Our results indicate that it may be applicable to build an automated agent that can detect unsafe and unpleasant situations that children with special needs may encounter, given the dialogue contexts conducted with these children.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.