Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.474

Focus Attention: Promoting Faithfulness and Diversity in Summarization

Abstract: Professional summaries are written with document-level information, such as the theme of the document, in mind. This is in contrast with most seq2seq decoders which simultaneously learn to focus on salient content, while deciding what to generate, at each decoding step. With the motivation to narrow this gap, we introduce Focus Attention Mechanism, a simple yet effective method to encourage decoders to proactively generate tokens that are similar or topical to the input document. Further, we propose a Focus Sa…
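The abstract describes biasing the decoder toward tokens that are similar or topical to the input document. As a rough illustration of that idea (not the paper's exact Focus Attention formulation), the sketch below adds a simple source-conditioned bias to the vocabulary logits at each decoding step; the function name, the `bias_scale` parameter, and the binary source indicator are assumptions made here for illustration.

```python
import torch
import torch.nn.functional as F

def focus_biased_logits(decoder_logits, source_token_ids, vocab_size, bias_scale=1.0):
    """Illustrative sketch: bias the decoder's vocabulary distribution toward
    tokens that appear in the input document, so generation stays topical.

    decoder_logits:   (batch, vocab_size) raw logits at the current decoding step
    source_token_ids: (batch, src_len) token ids of the input document
    """
    # Indicator over the vocabulary marking tokens present in the source document.
    batch = decoder_logits.size(0)
    source_mask = torch.zeros(batch, vocab_size, device=decoder_logits.device)
    source_mask.scatter_(1, source_token_ids, 1.0)

    # "Short circuit" from the source to the output distribution: raise the
    # logits of source tokens before normalizing.
    biased = decoder_logits + bias_scale * source_mask
    return F.log_softmax(biased, dim=-1)
```

In the paper the bias is learned and conditioned on the encoder states rather than being a fixed indicator; this sketch only conveys the general direction of pushing probability mass toward source-related tokens.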

Cited by 29 publications (29 citation statements)
References 68 publications
“…The encoder is expected to turn input text into meaningful representations so that a model can comprehend the input. When encoders learn wrong correlations between different parts of the training data, this can result in erroneous generation that diverges from the input [2,48,98,172].…”
Section: Hallucination From Training and Inference
confidence: 99%
“…The attention mechanism is an integral component for selectively concentrating on relevant parts of the input while ignoring others, based on dependencies in neural networks [4,177]. To encourage the generator to pay more attention to the source, Aralikatte et al. [2] introduce a short circuit from the input document to the vocabulary distribution via a source-conditioned bias. Krishna et al. [85] employ sparse attention to improve the model's long-range dependencies, in the hope of modeling more retrieved documents to mitigate hallucination in the answer.…”
Section: Information Augmentation
confidence: 99%
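The sparse-attention idea mentioned in the statement above can take several forms; a common one restricts each position to a fixed local window so that long inputs remain tractable. The sketch below builds such a banded attention mask in PyTorch; the function name, the `window` parameter, and the windowed-mask formulation are illustrative assumptions, not Krishna et al.'s specific method.

```python
import torch

def local_window_attention_mask(seq_len, window, device=None):
    """Build a sparse (banded) self-attention mask in which each position may
    only attend to tokens within +/- `window` positions of itself.
    Returns a boolean mask of shape (seq_len, seq_len); True marks allowed pairs.
    """
    positions = torch.arange(seq_len, device=device)
    distance = (positions[None, :] - positions[:, None]).abs()
    return distance <= window

# Usage: mask attention scores before the softmax.
scores = torch.randn(1, 8, 128, 128)                 # (batch, heads, q_len, k_len)
mask = local_window_attention_mask(128, window=16)   # (q_len, k_len)
scores = scores.masked_fill(~mask, float("-inf"))    # disallowed pairs get zero weight
attn = torch.softmax(scores, dim=-1)
```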
“…First, a separate correction model is learned to fix errors made by the summarizers (Zhao et al., 2020), including replacing entities absent from the source or revising all possible errors (Cao et al., 2020). The second type modifies the sequence-to-sequence architecture to incorporate relation triplets (Cao et al., 2018), knowledge graphs (Zhu et al., 2021), and topics (Aralikatte et al., 2021) to inform the summarizers of article facts. Yet additional engineering efforts and model retraining are often needed.…”
Section: Related Work
confidence: 99%