Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
DOI: 10.18653/v1/2021.emnlp-main.594

MS^2: Multi-Document Summarization of Medical Studies

Abstract: To assess the effectiveness of any medical intervention, researchers must conduct a time-intensive and manual literature review. NLP systems can help to automate or assist in parts of this expensive process. In support of this goal, we release MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470k documents and 20K summaries derived from the scientific literature. This dataset facilitates the development of systems that can assess and aggregate contradictory evidence across multiple stud…

Cited by 43 publications
(33 citation statements)
References 44 publications
“…Overall we evaluate (using both automatic metrics and human evaluation) a total of 23 models, two of which formed our official submissions to the leaderboard. Both submitted models substantially outperform the baseline approaches (DeYoung et al., 2021) in terms of automatic metrics, and one achieves the best performance in terms of BERTScore and ROUGE-2 among all submissions. Overall, our contributions in comparison to the previously published domain-specific models for MDS are the following:…”
Section: Introduction (mentioning)
confidence: 92%
“…In this paper we describe our experiments and results on the Multidocument Summarisation for Literature Review (MSLR) shared task. In particular, we attempt to improve on previous multi-document summarisation models in the biomedical domain, which have tried to integrate domain knowledge by marking important biomedical entities (Wallace et al., 2021; DeYoung et al., 2021). We hypothesise that highlighting such entities by placing global attention on them will enable better aggregation and normalisation of related entities across documents, and thus improve the factuality of the generated summaries.…”
Section: Introduction (mentioning)
confidence: 99%
“…Typically, deep learning is used for abstracts [15,19,28,34,65,75] since presumably more training data are available, whereas for full papers, also called zone identification, hand-crafted features and linear models have been suggested [2,4,23,44]. However, deep learning approaches have also been applied successfully to full papers in related tasks such as argumentation mining [41], document summarisation [1,16,21,27], or n-ary relation extraction [25,33,36]. Thus, the potential of deep learning has not been fully exploited yet for sequential sentence classification on full papers, and no unified solution for abstracts as well as full papers exists.…”
Section: Introduction (mentioning)
confidence: 99%
“…Our setting presents unique challenges. First, our approach requires retrieving literature based on noisy EHR notes containing multitudes of information (e.g., medical history, ongoing treatments), unlike orthogonal efforts on extracting and summarizing scholarly information related to well-formed questions (e.g., the efficacy of ACE inhibitors in adult patients with type-2 diabetes) (Wallace, 2019; Lehman et al., 2019; DeYoung et al., 2020, 2021).…”
Section: Introduction (mentioning)
confidence: 99%