Sentence extraction-based presentation summarization techniques and evaluation metrics

Chinese Spoken Language Processing

Wang et al. 2006

The purpose of extractive summarization is to automatically select a number of indicative sentences, passages, or paragraphs from the original document according to a target summarization ratio and then sequence them to form a concise summary. In this paper, we propose the use of probabilistic latent topical information for extractive summarization of spoken documents. Various kinds of modeling structures and learning approaches were extensively investigated. In addition, the summarization capabilities were verified by comparison with the conventional vector space model and latent semantic indexing model, as well as the hidden Markov model (HMM). The experiments were performed on Chinese broadcast news collected in Taiwan. Noticeable performance gains were obtained.

Extractive Chinese Spoken Document Summarization Using Probabilistic Ranking Models

Chinese Spoken Language Processing

Wang et al. 2006

“…There have also been attempts to apply this metric to text summaries of speech data with mixed results (see Nenkova and McKeown (2011) for a review). ROUGE performed reasonably well for the evaluation of text summaries of spoken presentations (Hirohata et al., 2005), but was not correlated with the summary accuracy of summaries of meetings or conversations (although see Penn and Zhu, 2008). …”

Section: Introduction (mentioning); confidence: 99%

Automatic evaluation of spoken summaries: the case of language assessment

Loukina

Zechner

Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications

2014

This paper investigates whether ROUGE, a popular metric for the evaluation of automated written summaries, can be applied to the assessment of spoken summaries produced by non-native speakers of English. We demonstrate that ROUGE, with its emphasis on the recall of information, is particularly suited to the assessment of the summarization quality of non-native speakers' responses. A standard baseline implementation of ROUGE-1 computed over the output of the automated speech recognizer has a Spearman correlation of ρ = 0.55 with experts' scores of speakers' proficiency (ρ = 0.51 for a content-vector baseline). Further increases in agreement with experts' scores can be achieved by using types instead of tokens for the computation of word frequencies for both candidate and reference summaries, as well as by using multiple reference summaries instead of a single one. These modifications increase the correlation with experts' scores to a Spearman correlation of ρ = 0.65. Furthermore, we found that the choice of reference summaries does not have any impact on performance, and that the adjusted metric is also robust to errors introduced by automated speech recognition (ρ = 0.67 for human transcriptions vs. ρ = 0.65 for speech recognition output).
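The two modifications described in this abstract (counting word types instead of tokens, and scoring against several reference summaries) can be illustrated with a minimal sketch of ROUGE-1 recall. This is not the authors' implementation: the function names, the whitespace tokenization, and the pooling of references by summing matches and reference lengths are all assumptions made for illustration.

```python
from collections import Counter


def unigram_counts(text, use_types=False):
    """Count unigrams in a text (assumed tokenization: lowercase + whitespace split).

    With use_types=True every distinct word counts once, however often it occurs.
    """
    tokens = text.lower().split()
    if use_types:
        return Counter(set(tokens))
    return Counter(tokens)


def rouge1_recall(candidate, references, use_types=False):
    """ROUGE-1 recall of a candidate summary against one or more references.

    Matches are clipped by the candidate's counts and pooled over all
    references (an assumed pooling scheme; others exist).
    """
    cand = unigram_counts(candidate, use_types)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = unigram_counts(ref, use_types)
        # A reference unigram is matched at most as often as the candidate has it.
        matched += sum(min(n, cand[w]) for w, n in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    references = ["the cat is on the mat"]
    print(rouge1_recall(candidate, references))                  # token-based
    print(rouge1_recall(candidate, references, use_types=True))  # type-based
```

Passing more than one string in `references` gives the multi-reference variant; the type-based option removes the extra weight that repeated words would otherwise carry.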

“…In [3,4], the authors suggested that important sentences can be selected from the significant parts of a document. For example, sentences can be selected from the introductory and concluding parts.…”

Section: Introduction (mentioning); confidence: 99%

Word Topical Mixture Models for Extractive Spoken Document Summarization

2007 IEEE International Conference on Multimedia and Expo

2007

This paper considers extractive summarization of Chinese spoken documents. In contrast to conventional approaches, we attempt to deal with the extractive summarization problem under a probabilistic generative framework. A word topical mixture model (w-TMM) was proposed to explore the co-occurrence relationship between words of the language. Each sentence of the spoken document to be summarized was treated as a composite word TMM for generating the document, and sentences were ranked and selected according to their likelihoods. Various kinds of modeling structures and learning approaches were extensively investigated. In addition, the summarization capabilities were verified by comparison with other conventional summarization approaches. The experiments were performed on Chinese broadcast news collected in Taiwan. Noticeable performance gains were obtained. The proposed summarization technique has also been properly integrated into our prototype system for voice retrieval of broadcast news via mobile devices.
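The ranking idea in this abstract, scoring each sentence by how likely it is to generate the whole document, can be sketched with a much simpler stand-in for the w-TMM: a smoothed unigram model per sentence. Everything specific here is an assumption for illustration, not the paper's method: the function name `rank_sentences`, the interpolation weight `lam`, the use of the document itself as the background model, and the interpretation of the summarization ratio as a fraction of sentences.

```python
import math
from collections import Counter


def rank_sentences(sentences, ratio=0.3, lam=0.5):
    """Select sentences by the log-likelihood of the document under each sentence.

    Each sentence S is a unigram model interpolated with a background model:
        P(w | S) = lam * c(w, S) / |S| + (1 - lam) * c(w, D) / |D|
    Keep lam < 1 so every document word has non-zero probability.
    """
    tokenized = [s.lower().split() for s in sentences]
    doc_counts = Counter(w for toks in tokenized for w in toks)
    doc_len = sum(doc_counts.values())

    scores = []
    for toks in tokenized:
        sent_counts = Counter(toks)
        sent_len = len(toks)
        loglik = 0.0
        for w, c in doc_counts.items():
            p_sent = sent_counts[w] / sent_len if sent_len else 0.0
            p_bg = c / doc_len
            loglik += c * math.log(lam * p_sent + (1 - lam) * p_bg)
        scores.append(loglik)

    # Keep the top-scoring fraction of sentences (at least one).
    k = max(1, round(ratio * len(sentences)))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    # Restore original order so the summary reads in document sequence.
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "stocks fell sharply as markets reacted to the rate decision",
        "the weather in the capital was mild",
        "analysts expect markets to stay volatile after the rate decision",
    ]
    for sentence in rank_sentences(doc, ratio=0.34):
        print(sentence)
```

A topical mixture model would replace the per-sentence unigram estimate with probabilities drawn from latent topics; the select-then-resequence step would stay the same.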