Common ground processes [26] can improve performance in communication tasks [72, 42, 43, 24], and understanding these processes will likely benefit human-computer dialogue interfaces. However, there are multiple proposed theories with different implications for interface design. Fusaroli and Tylén [40] compared the theories directly by designing two models: one based on alignment theory and one based on complementarity theory, the latter encapsulating both interpersonal synergy and audience design. The current research used these models, extending them to differentiate between interpersonal synergy and audience design. Few studies have tested multiple common ground models against tasks representative of envisioned human-computer interaction (HCI) applications. We report on four such tests, which allowed us to examine the generalizability of the findings. Results supported the complementarity models over the alignment model and were suggestive of the audience design variant of complementarity, providing guidance for HCI design that differs from contemporary approaches.
Previously, spoken uncertainty has been analyzed using either lexical or acoustic features, but few, if any, studies have used both feature types in combination. It is therefore unknown to what extent these feature types provide redundant information. Additionally, prior research has focused on the acoustic features of single words only, and it is unclear whether those results generalize to perceived uncertainty in spontaneous speech. The current study elicited spontaneous speech through a team dialogue task in which two people worked together to locate street-level pictures of different houses on an overhead map. The communications were recorded, transcribed, broken into utterances, and presented to 10 individuals who rated each utterance on a 5-point Likert scale from 1 (very uncertain) to 5 (very certain). A large number of acoustic and lexical features from the literature were calculated for each utterance. Random forest classification (Breiman, 2001) was used to select features and then to investigate feature importance, both for individual features and at the aggregate level of feature type. Results indicate that lexical features were much more important than acoustic features and suggest that previous findings using acoustic features might not generalize to spontaneous speech. Additional acoustic features are explored to improve performance.
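The step from individual feature importance to importance at the level of feature type can be illustrated with a minimal sketch. It assumes that a per-feature importance score has already been obtained from a fitted random forest and that type-level importance is the sum of the scores within each type; the abstract does not state the aggregation rule, so summation is an assumption here. The function name, the feature names, and the scores are all hypothetical placeholders, not values from the study.

```python
from collections import defaultdict


def aggregate_importance(importances, feature_types):
    """Sum per-feature importance scores within each feature type.

    importances: dict mapping feature name -> importance score
    feature_types: dict mapping feature name -> type label
    """
    totals = defaultdict(float)
    for name, score in importances.items():
        totals[feature_types[name]] += score
    return dict(totals)


# Placeholder scores for illustration only; these are NOT results from the study.
importances = {
    "hedge_count": 0.5,    # hypothetical lexical feature
    "filler_count": 0.25,  # hypothetical lexical feature
    "mean_pitch": 0.125,   # hypothetical acoustic feature
    "speech_rate": 0.125,  # hypothetical acoustic feature
}
feature_types = {
    "hedge_count": "lexical",
    "filler_count": "lexical",
    "mean_pitch": "acoustic",
    "speech_rate": "acoustic",
}

by_type = aggregate_importance(importances, feature_types)
print(by_type)
```

Summing is only one possible choice. Averaging within a type would instead control for the number of features in each type, which matters when one type contributes many more features than the other.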