Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.180
AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

Abstract: Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions. Each question thread can receive a large number of answers with different perspectives. One goal of answer summarization is to produce a summary that reflects the range of answer perspectives. A major obstacle for this task is the absence of a dataset to provide supervision for producing such summaries. Recent works propose heuristics to create such …
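As context for the task the abstract describes, the following sketch shows a naive single-pass baseline: concatenate all answers in a thread and feed them to an off-the-shelf abstractive summarizer. This is not the paper's annotation pipeline or model; the example thread is invented and the choice of the facebook/bart-large-cnn checkpoint is an illustrative assumption.

# Naive multi-answer summarization baseline (illustrative sketch; not the AnswerSumm pipeline).
# Assumes the `transformers` library is installed; the question and answers below are made up.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

question = "How do I keep my sourdough starter alive while travelling?"
answers = [
    "Refrigerate it; a cold starter only needs feeding about once a week.",
    "Dry a thin layer of starter and rehydrate it when you are back home.",
    "Ask a friend to feed it on your usual schedule so it never goes dormant.",
]

# Concatenate the question and every answer so the model sees all perspectives at once.
document = question + " " + " ".join(answers)

summary = summarizer(document, max_length=60, min_length=15, do_sample=False)
print(summary[0]["summary_text"])

A baseline like this tends to echo only the most prominent answer, which motivates the paper's goal of summaries that explicitly cover the range of answer perspectives.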

Cited by 3 publications (6 citation statements) | References 35 publications
“…However, our task aims to generate a unimodal output, that is, a purely textual summary. This is similar to the multimodal summarization done on the How2 Dataset (Sanabria et al., 2018), AnswerSumm (Fabbri et al., 2022), ConvoSumm (Fabbri et al., 2021), SAMSum (Gliwa et al., 2019), CNN/DM (Nallapati et al., 2016), MSMO DailyMail (Zhu et al., 2018), and How2 (Yu et al., 2021), where a textual transcript of the video along with the video frames is summarized into text. Yu et al. (2021) reported that incorporating the additional modality of the video frames into their summarization models showed improvement compared to text-only models.…”
Section: Related Work | Citation type: mentioning | Confidence: 72%
“…ConvoSumm (Fabbri et al., 2021) presented a dataset of 2,000 summarized forum threads, 500 from each of 4 different domains including NYT articles, Reddit, StackExchange, and email threads. AnswerSumm (Fabbri et al., 2022) is another dataset consisting of 4,631 question-answering discussion threads sourced from StackExchange. AnswerSumm shares the most similarities with our dataset, as they also summarize multi-speaker threads, and their annotation pipeline shares some similarities with ours.…”
Section: Related Work | Citation type: mentioning | Confidence: 99%
“…Recently, several query-focused summarization datasets have been introduced, which can be further divided into short-document datasets, whose source document length does not exceed the input limits of standard pretrained models, and long-document datasets. Within short-document, query-focused summarization, AnswerSumm (Fabbri et al., 2021c) is composed of summaries of answers to queries from StackExchange forums, while WikiHowQA (Deng et al., 2020) proposes the task of answer selection followed by the summarization of individual response articles to queries from the how-to site WikiHow. Within long-document summarization, WikiSum (Liu et al., 2018a) consists of Wikipedia article titles as queries, the first paragraph of the article as the summary, and documents referenced by the article as the input.…”
Section: Query-focused Summarization | Citation type: mentioning | Confidence: 99%