The automated transcription of spoken language, and of meetings in particular, is becoming more widespread as automatic speech recognition systems become more accurate. This trend has accelerated significantly since the outbreak of the COVID-19 pandemic, which led to a major increase in the number of online meetings. However, transcribed spoken language has received little attention from the NLP community compared to documents and other forms of written language. In this paper, we study a variation of the summarization problem over transcribed spoken language: given a transcribed meeting and an action item (i.e., a commitment or request to perform a task), our goal is to generate a coherent and self-contained rephrasing of the action item. To this end, we compiled a novel dataset of annotated meeting transcripts, including human rephrasings of action items. We use state-of-the-art supervised text generation techniques and establish a strong baseline based on BART and UniLM (two pretrained transformer models). Because natural speech is often broken and incomplete, the task is shown to be harder than an analogous task over email data. In particular, we show that the baseline models can be greatly improved once they are provided with additional information. We compare two approaches: one incorporates features extracted by coreference resolution, and the other uses additional annotations to train an auxiliary model that detects the relevant context in the text. Based on a systematic human evaluation, our best models exhibit near-human-level rephrasing capability on a constrained subset of the problem.