2022
DOI: 10.1145/3545176

An Empirical Survey on Long Document Summarization: Datasets, Models, and Metrics

Abstract: Long documents such as academic articles and business reports have been the standard format to detail out important issues and complicated subjects that require extra attention. An automatic summarization system that can effectively condense long documents into short and concise texts to encapsulate the most important information would thus be significant in aiding the reader’s comprehension. Recently, with the advent of neural architectures, significant research efforts have been made to advance automatic tex…

Cited by 45 publications (23 citation statements)
References 120 publications
“…We can cite the UDEG model (Jeong et al., 2021). This one uses an abstract automatic summarization system to summarize documents (Lu and Conrad, 2012; Koh et al., 2022). Its abstraction capacity allows it to introduce words into summaries that are not present in original documents.…”
Section: Related Work (mentioning)
confidence: 99%
“…While there do exist more powerful dialogue summarization models such as DialogLM [29] and Summ [28], we use the BART (Bidirectional and Auto-Regressive Transformers) model [12] due to its speed and high performance in long document summarization tasks [11]. In addition, there has been previous research in assessing different topic segmentation methods on the BART model, so this allows us to evaluate our techniques.…”
Section: Bart Model For Meeting Summarization (mentioning)
confidence: 99%
“…Extractive summarization techniques locate the most important phrases and sentences from the input transcript and concatenate them to form a concise summary. However, the summaries generated by these techniques are usually awkward to read because of the forceful concatenation of unrelated sentences [11]. Abstractive summarization techniques focus more on understanding the overall meaning of a transcript and then generating a concise summary based on the entire text.…”
Section: Introduction (mentioning)
confidence: 99%
“…Performing salient content selection is a nature way to alleviate the problem and it has been widely used in other fields [15]. For example, Gehrmann et al. [10] apply extract-abstract methods to improve summary performance.…”
Section: Query Reformulation and Content Selection (mentioning)
confidence: 99%