Self-Attention Guided Copy Mechanism for Abstractive Summarization

Xu, Shanjia; Li, Haoran; Yuan, Peng; Wu, Youzheng; He, Xiaodong; Zhou, Bowen

doi:10.18653/v1/2020.acl-main.125

Cited by 62 publications

(43 citation statements)

References 28 publications

(28 reference statements)

“…We see similar kind of improvements as observed in Table 1, except for ROUGE-2 for ROBFAME which is 0.23 points worse than the ROBERTAS2S baseline. Our best model PEGFAME performs better than both copy mechanism models: LSTM-based PtGen (See et al, 2017) and Transformer-based SAGCopy (Xu et al, 2020). PEGFAME performs worse when compared with T5 (Raffel et al, 2019), the original PEGASUS and ProphetNet (Qi et al, 2020).…”

Section: Bias In Data (mentioning)

confidence: 89%

“…Cao et al (2017) force faithful generation by conditioning on both source text and extracted fact descriptions from the source text. Song et al (2020) propose to jointly generate a sentence and its syntactic dependency parse to induce grammaticality and faithfulness. Tian et al (2019) learn a confidence score to ensure that the model attends to the source whenever necessary.…”

Section: Topic-aware Generation Models (mentioning)

confidence: 99%

Focus Attention: Promoting Faithfulness and Diversity in Summarization

Aralikatte, Narayan, Maynez, et al. 2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer

Professional summaries are written with document-level information, such as the theme of the document, in mind. This is in contrast with most seq2seq decoders which simultaneously learn to focus on salient content, while deciding what to generate, at each decoding step. With the motivation to narrow this gap, we introduce Focus Attention Mechanism, a simple yet effective method to encourage decoders to proactively generate tokens that are similar or topical to the input document. Further, we propose a Focus Sampling method to enable generation of diverse summaries, an area currently understudied in summarization. When evaluated on the BBC extreme summarization task, two state-of-the-art models augmented with Focus Attention generate summaries that are closer to the target and more faithful to their input documents, outperforming their vanilla counterparts on ROUGE and multiple faithfulness measures. We also empirically demonstrate that Focus Sampling is more effective in generating diverse and faithful summaries than top-k or nucleus sampling-based decoding methods.

“…Previous encoder-decoder models (Rush et al, 2015; Nallapati et al, 2016; Paulus et al, 2018; Chopra et al, 2016) equipped with the attention mechanism (Bahdanau et al, 2015) have achieved great performance on abstractive summarization. However, they were found to miss some important content in input documents (Li et al, 2018; Xu et al, 2020). How to retain the key information of input documents in the generated summaries has received increasing attention in the past few years.…”

Section: Related Work (mentioning)

confidence: 99%

“…Gehrmann et al (2018) utilize the attention masks to restrict copying phrases from the selected parts of an input document. Xu et al (2020) explicitly guide the copy process with the centrality of each source word. Several papers also explore the potential of enhancing the encoder.…”

Section: Related Work (mentioning)

confidence: 99%

Highlight-Transformer: Leveraging Key Phrase Aware Attention to Improve Abstractive Multi-Document Summarization

Liu, Cao, Yang, et al. 2021

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Abstractive multi-document summarization aims to generate a comprehensive summary covering salient content from multiple input documents. Compared with previous RNN-based models, the Transformer-based models employ the self-attention mechanism to capture the dependencies in input documents and can generate better summaries. Existing works have not considered key phrases in determining attention weights of self-attention. Consequently, some of the tokens within key phrases only receive small attention weights. It can affect completely encoding key phrases that convey the salient ideas of input documents. In this paper, we introduce the Highlight-Transformer, a model with the highlighting mechanism in the encoder to assign greater attention weights for the tokens within key phrases. We propose two structures of highlighting attention for each head and the multi-head highlighting attention. The experimental results on the Multi-News dataset show that our proposed model significantly outperforms the competitive baseline models.

Exploring the Incorporation of Opinion Polarity for Abstractive Multi-document Summarisation

Ramsauer, Kruschwitz 2021

Lecture Notes in Computer Science
