2020
DOI: 10.1609/aaai.v34i05.6284

Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation

Abstract: Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model – a neural architecture consisting of two hierarchically connected Transformer networks – is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. …
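The two-level design described in the abstract can be sketched briefly: a token-level Transformer encodes each sentence, a second Transformer contextualizes the resulting sentence vectors across the snippet, and a binary head marks segment boundaries. The PyTorch code below is a minimal illustration under assumed names, hyperparameters, and first-token pooling, not the authors' implementation.

```python
# Minimal sketch of a two-level (hierarchical) Transformer segmenter.
# All names and hyperparameters are illustrative assumptions.
import torch.nn as nn

class TwoLevelSegmenter(nn.Module):
    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Level 1: token-level Transformer producing sentence representations.
        token_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.token_encoder = nn.TransformerEncoder(token_layer, num_layers)
        # Level 2: sentence-level Transformer contextualizing the sentence
        # vectors within the snippet before boundary prediction.
        sent_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.sent_encoder = nn.TransformerEncoder(sent_layer, num_layers)
        # Binary head: does a segment boundary follow this sentence?
        self.boundary_head = nn.Linear(d_model, 2)

    def forward(self, tokens):
        # tokens: (num_sentences, max_tokens) token ids of one text snippet
        x = self.token_encoder(self.embed(tokens))       # (S, T, d)
        sent_repr = x[:, 0, :]                           # first-token pooling (assumption)
        ctx = self.sent_encoder(sent_repr.unsqueeze(0))  # (1, S, d)
        return self.boundary_head(ctx.squeeze(0))        # (S, 2) boundary logits
```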

Cited by 32 publications (36 citation statements)
References 23 publications (66 reference statements)
“…The central idea of each topic is summarized with one sentence, covering information from multiple utterances, i.e., t₁ for S₁, t₂ for S₂, and t₃ for S₃. We also observe that utterances residing in the same topic (e.g., S₁, S₂ and S₃) are inherently more coherent than those coming from different topics (e.g., the inter-topic snippets S₄ and S₅), which reveals the underlying relationship between topic and utterance coherence, also demonstrated by Glavaš and Somasundaran (2020).…”
Section: Introduction (supporting)
confidence: 52%
“…The closest model to ours is proposed in Glavaš and Somasundaran (2020), where Transformers are used for both levels of the architecture. They also developed a semantic coherence measure that distinguishes pairs of genuine and fake text snippets, used as an auxiliary loss alongside the segment classification loss.…”
Section: Supervised Segmentation Models (mentioning)
confidence: 99%
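The auxiliary coherence objective quoted above lends itself to a short sketch: corrupt a genuine snippet by shuffling its sentences, score both versions with a coherence scorer, and apply a pairwise hinge loss so that the genuine snippet scores higher by a margin. The scorer interface, margin value, and shuffling scheme below are illustrative assumptions, not the cited paper's exact formulation.

```python
# Hedged sketch of a pairwise coherence objective on (genuine, corrupted)
# snippet pairs. score_fn, the margin, and the corruption scheme are
# illustrative assumptions.
import random
import torch.nn.functional as F

def corrupt(sentences):
    """Return a 'fake' snippet: the same sentences in a shuffled order."""
    fake = sentences[:]
    if len(fake) < 2:
        return fake
    while fake == sentences:  # ensure the order actually changes
        random.shuffle(fake)
    return fake

def coherence_loss(score_fn, sentences, margin=1.0):
    """Hinge loss pushing the genuine score above the corrupted one.

    score_fn maps a list of sentences to a scalar coherence score,
    e.g. a linear head on top of a sentence-level encoder.
    """
    s_real = score_fn(sentences)
    s_fake = score_fn(corrupt(sentences))
    return F.relu(margin - s_real + s_fake)
```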
“…However, with the advancement of self-learning and transfer learning on deep neural networks, more recent supervised modeling approaches have been proposed that aim to predict labeled segment boundaries on smaller datasets (Koshorek et al., 2018; Xing et al., 2020; Barrow et al., 2020; Glavaš and Somasundaran, 2020). To the best of our knowledge, the most straightforward remedy to the above problems is knowledge transfer and distillation from pre-trained models. The rich pre-trained knowledge enables the training of a more general segmentation model on a small labeled dataset.…”
Section: Introduction (mentioning)
confidence: 99%
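The knowledge-transfer remedy described in this excerpt typically amounts to fine-tuning a pre-trained encoder with a small boundary-classification head on the limited labeled data. The sketch below uses the Hugging Face transformers library; the checkpoint name and per-sentence [CLS] pooling are assumptions for illustration, not a specific system from the cited papers.

```python
# Hedged sketch: fine-tune a pre-trained BERT-style encoder as a
# per-sentence boundary classifier. Checkpoint and pooling are assumptions.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class PretrainedBoundaryClassifier(nn.Module):
    def __init__(self, checkpoint="bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        self.head = nn.Linear(self.encoder.config.hidden_size, 2)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] vector per sentence
        return self.head(cls)              # boundary vs. non-boundary logits

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = PretrainedBoundaryClassifier()
batch = tokenizer(["First sentence.", "Second sentence."],
                  padding=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])  # shape (2, 2)
```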
“…Badjatiya et al. [38] proposed an attention-based convolutional neural network with bidirectional LSTM that introduced the attention mechanism and learned the relative importance of each sentence in the text to achieve segmentation. Glavaš and Somasundaran [39] proposed a multi-task learning model that couples the sentence-level segmentation objective with a coherence objective that differentiates correct sequences of sentences from corrupt ones.…”
Section: Text Segmentation (mentioning)
confidence: 99%
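In its simplest form, the multi-task coupling that this excerpt describes reduces to a weighted sum of the segmentation loss and the auxiliary coherence loss. The single weighting factor lam below is an illustrative assumption.

```python
# Hedged sketch of the multi-task objective: segmentation loss plus a
# weighted auxiliary coherence loss. The weighting scheme is an assumption.
import torch.nn.functional as F

def multitask_loss(boundary_logits, boundary_labels, coh_loss, lam=0.5):
    """boundary_logits: (S, 2); boundary_labels: (S,); coh_loss: scalar."""
    seg_loss = F.cross_entropy(boundary_logits, boundary_labels)
    return seg_loss + lam * coh_loss
```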