Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1311

Subtopic-driven Multi-Document Summarization

Abstract: In multi-document summarization, the set of documents to be summarized is assumed to be on the same topic, referred to in this paper as the underlying topic. That is, the underlying topic is collectively represented by all the documents in the set. Meanwhile, different documents may cover various subtopics, and the same subtopic can span several documents. Inspired by topic models, the underlying topic of a document set can also be viewed as a collection of subtopics of differing importance.…

Cited by 18 publications (14 citation statements)
References 25 publications
“…We adopt 100 words as the length limit in the DUC dataset, instead of 665 bytes specified by the official task. The change has also been made to provide the same setting for evaluating various methods in Hong et al (2014) and Zheng et al (2019). For the Yelp dataset, we set the limit to be the 99.5th percentile less than the maximum length of any document; for Multi-News, the limit is set as 300 words.…”
Section: Methods
confidence: 99%
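The percentile-based length limit described in the statement above could be computed as follows. This is a sketch only: the helper name `length_limit` and the nearest-rank percentile method are assumptions for illustration, since the citing paper's exact procedure is not shown on this page.

```python
import math

def length_limit(doc_word_counts, pct=99.5):
    """Return the pct-th percentile of document lengths (nearest-rank
    method), capped strictly below the maximum document length, mirroring
    'the 99.5th percentile less than the maximum length of any document'."""
    ordered = sorted(doc_word_counts)
    # Nearest-rank percentile: smallest value with at least pct% of docs at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    limit = ordered[rank - 1]
    # Keep the limit strictly below the longest document.
    return min(limit, ordered[-1] - 1)

# Example with hypothetical per-document word counts:
print(length_limit([120, 95, 300, 240, 180, 600]))
```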
“…In the experiments, we report the different combinations of ROUGE scores for each dataset, which have been recommended and adopted by previous works. Specifically, the recall scores of R-1,2,4 will be reported for the DUC 2004 dataset according to Hong et al (2014), Wang et al (2017) and Zheng et al (2019); the F1 scores of R-1,2,L will be reported for Yelp as in Chu and Liu (2019); the F1 scores of R-1,2,SU4 will be reported for Multi-News as in Fabbri et al (2019). The toolkit for computing ROUGE metrics is ROUGE-1.5.5 and its option is set to be '-m -c 95 -r 1000 -f A -p 0.5 -t 0'.…”
Section: Evaluation Metrics
confidence: 99%
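For readers unfamiliar with the metric family in the statement above, ROUGE-1 recall is clipped unigram overlap divided by the number of reference unigrams. The toy sketch below is not the official ROUGE-1.5.5 Perl toolkit the authors run (with options '-m -c 95 -r 1000 -f A -p 0.5 -t 0'); it omits stemming, bootstrap confidence intervals, and length truncation, and the function name `rouge1_recall` is assumed for illustration.

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Clipped unigram overlap between candidate and reference,
    divided by the reference unigram count (ROUGE-1 recall)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Each reference unigram can be matched at most as often as it
    # appears in the candidate (clipping).
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

print(rouge1_recall("the cat sat on the mat", "the cat is on the mat"))
```

ROUGE-2 and ROUGE-SU4 follow the same pattern over bigrams and skip-bigrams, respectively.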
“…Wei et al (2012) proposed to build a document graph consisting of word, sentence, and topic nodes and to learn the graph with a Markov chain. Zheng et al (2019) proposed to summarize multiple documents by mining cross-document subtopics. Narayan et al (2018) recommended enriching word representations with topical information.…”
Section: Topic Modeling For Summarization
confidence: 99%