Multi-party dialogue question answering (MPDQA) is an emerging topic in speech and language processing in which the goal is to answer questions about multi-party conversations. Unlike conventional QA, which assumes a single speaker (writer) and general listeners (readers), MPDQA involves multiple speakers and specific listeners. Prior work simply treats dialogues as plain passages, neglecting role awareness, such as speakers' perspectives and co-references. This paper proposes the Role Aware Multi-Party Network (RAMPNet), a model that uses speaker and role information to represent "who is speaking" and "who is mentioned", making role awareness available to the model. Experiments show that RAMPNet outperforms the BERT baseline on a large-scale MPDQA dataset, FriendsQA, especially on "Who" and "How" questions, which depend strongly on understanding conversational relations. In addition, further analysis demonstrates RAMPNet's effectiveness on questions containing verbs such as "reply" or "talk", which relate to interactions between speakers and roles. Together, these results show the capability and utility of RAMPNet.
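To make the notions of "who is speaking" and "who is mentioned" concrete, the following is a minimal sketch of tagging each utterance with its speaker and the other roles named in it. It is not RAMPNet and not the authors' preprocessing: the function `tag_roles`, the `TaggedUtterance` type, the name-matching rule, and the toy dialogue are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TaggedUtterance:
    speaker: str          # "who is speaking"
    mentioned: List[str]  # "who is mentioned": other known roles named in the text
    text: str


def tag_roles(dialogue: List[Tuple[str, str]], roles: List[str]) -> List[TaggedUtterance]:
    """Attach speaker and mentioned-role tags to each (speaker, text) pair.

    Hypothetical helper: a role counts as mentioned only if its name
    appears as a whole word in the utterance. Pronouns are not resolved.
    """
    tagged = []
    for speaker, text in dialogue:
        words = set(re.findall(r"[A-Za-z]+", text))
        mentioned = [r for r in roles if r in words and r != speaker]
        tagged.append(TaggedUtterance(speaker, mentioned, text))
    return tagged


# Invented toy dialogue, not taken from FriendsQA.
dialogue = [
    ("Alice", "Bob, did you reply to Carol yet?"),
    ("Bob", "Not yet, I will talk to her tonight."),
]
roles = ["Alice", "Bob", "Carol"]

tagged = tag_roles(dialogue, roles)
for u in tagged:
    print(u.speaker, u.mentioned)
```

In the second utterance "her" refers to Carol, but plain name matching finds no mention. That gap is the kind of co-reference information a passage-only treatment of dialogue loses.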
Deep cognitive skills of the kind humans possess are crucial for the development of various real-world applications that process diverse and abundant user-generated input. While recent progress in deep learning and natural language processing has enabled learning systems to reach human performance on some benchmarks requiring shallow semantics, such abilities remain challenging even for modern contextual embedding models, as pointed out by many recent studies [9,10,22,24,32]. Existing machine comprehension datasets assume sentence-level input, lack causal or motivational inferences, or can be answered by exploiting question-answer bias. Here, we present a challenging novel task, trope detection on films, in an effort to develop situation and behavior understanding in machines. Tropes are storytelling devices that are frequently used as ingredients in recipes for creative works. Compared to the tags in existing movie tag prediction tasks, tropes are more sophisticated: they vary widely, from a moral concept to a series of circumstances, and are embedded with motivations and cause-and-effect relations. We introduce a new dataset, Tropes in Movie Synopses (TiMoS), with 5623 movie synopses and 95 different tropes collected from a Wikipedia-style database, TVTropes. We present a multi-stream comprehension network (MulCom) leveraging multi-level attention over words, sentences, and role relations. Experimental results demonstrate that modern models, including BERT contextual embeddings, movie tag prediction systems, and relational networks, reach at most 37% of human performance (23.97/64.87) in terms of F1 score. Our MulCom outperforms all modern baselines by 1.5 to 5.0 F1 points and 1.5 to 3.0 mean average precision (mAP) points. We also provide a detailed analysis and human evaluation to pave the way for future research.
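The "37% of human performance" figure can be checked directly from the two F1 scores in the parenthetical. The sketch below assumes 23.97 is the best baseline F1 and 64.87 is the human F1, which is how the abstract appears to intend them; the function and variable names are illustrative only.

```python
# F1 scores quoted in the abstract; the interpretation of each is an assumption.
best_baseline_f1 = 23.97  # assumed: strongest modern baseline
human_f1 = 64.87          # assumed: human performance


def relative_performance(model_score: float, human_score: float) -> float:
    """Return the model score as a fraction of the human score."""
    if human_score <= 0:
        raise ValueError("human_score must be positive")
    return model_score / human_score


ratio = relative_performance(best_baseline_f1, human_f1)
print(f"{ratio:.1%}")  # roughly 37% of human F1
```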