Maximum entropy segmentation of broadcast news

Christensen, Heidi; Kolluru, BalaKrishna; Gotoh, Yoshihiko; Renals, Steve

doi:10.1109/icassp.2005.1415292

Cited by 16 publications

(13 citation statements)

References 5 publications


“…Stokes and Carthy [39] proposed a lexical chaining-based approach to coarse-grained segmentation of CNN news transcripts. Christensen and Kolluru [7] presented a cascading Automatic Speech Recognition (ASR) system with utterance and topic segmenters based on a Maximum Entropy model. Recently, Hsueh and Moore [15] investigated the problem of automatically predicting segment boundaries in spoken multiparty dialogues using a lexical cohesion-based model.…”

Section: Related Work (mentioning)

confidence: 99%


Semantic passage segmentation based on sentence topics for question answering

Myaeng, Jang

2007

Information Sciences


“…paragraphs or section boundaries [36]. The second defines passages of fixed length [5,7,25]. The last approach uses semantic clues or topicality for identifying passages [2,14,33].…”

Section: Introduction (mentioning)

confidence: 99%

Semantic passage segmentation based on sentence topics for question answering

Myaeng, Jang

2007

Information Sciences


“…Linguistically oriented studies generally only examine small amounts of data, usually from a restricted domain (e.g., reading aloud of constructed examples or small domain task-oriented dialogues). However, a number of studies have successfully employed prosody and timing features for discourse segmentation using larger data sets, e.g., topic segmentation of broadcast news [26,27,28]. Unsurprisingly, features in these studies are usually based around the idea of prosodic reset and differences in pitch ranges, but quantified directly from the speech signal.…”

Section: The Prosody Of Discourse Segments (mentioning)

confidence: 99%

Paragraph-based prosodic cues for speech synthesis applications

Farrús, Lai, Moore

2016

Speech Prosody 2016


Speech synthesis has improved in both expressiveness and voice quality in recent years. However, obtaining full expressiveness when dealing with large multi-sentential synthesized discourse is still a challenge, since speech synthesizers do not take into account the prosodic differences that have been observed in discourse units such as paragraphs. The current study validates and extends previous work by analyzing the prosody of paragraph units in a large and diverse corpus of TED Talks using automatically extracted F0, intensity and timing features. In addition, a series of classification experiments was performed in order to identify which features are consistently used to distinguish paragraph breaks. The results show significant differences in prosody related to paragraph position. Moreover, the classification experiments show that boundary features such as pause duration and differences in F0 and intensity levels are the most consistent cues in marking paragraph boundaries. This suggests that these features should be taken into account when generating spoken discourse in order to improve naturalness and expressiveness.


“…Christensen et al [5] presented a maximum entropy approach to find utterance and topic boundaries in news broadcasts. Similar to our approach, they use cue words and the pause length as features to recognize utterance boundaries.…”

Section: Related Work (mentioning)

confidence: 99%

Automatic call section segmentation for contact-center calls

Park

2007

Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management


This paper presents an SVM (Support Vector Machine) classification system which divides contact-center call transcripts into "Greeting", "Question", "Refine", "Research", "Resolution", "Closing" and "Out-of-topic" sections. This call section segmentation is useful to improve search and retrieval functions and to provide more detailed statistics on calls. We use an off-the-shelf automatic speech recognition (ASR) system to generate call transcripts from recorded calls between customers and service representatives. We first classify each individual utterance into a call section by applying the SVM classifier and then merge adjacent utterances classified into the same call section. We experiment with the proposed system on 100 automatically transcribed calls. The 10-fold cross validation shows 87.2% classification accuracy. We also compare the proposed algorithm with two other approaches: the most-frequent-section-only method and a maximum entropy-based segmentation. The evaluation shows that our system's accuracy is 12% higher than the first baseline system and 6% higher than the second baseline system, respectively.
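The merge step mentioned in the abstract above (adjacent utterances with the same predicted section are joined) can be illustrated with a minimal sketch. The function name `merge_adjacent` and the label sequence are hypothetical and do not come from the paper; the classifier itself is omitted, and the per-utterance labels are simply assumed to be given.

```python
from itertools import groupby


def merge_adjacent(labels):
    """Collapse runs of identical per-utterance labels into
    (label, first_index, last_index) spans."""
    spans = []
    index = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)  # number of utterances in this run
        spans.append((label, index, index + length - 1))
        index += length
    return spans


# Hypothetical classifier output, one label per utterance (illustrative only).
predicted = [
    "Greeting", "Greeting",
    "Question", "Question", "Question",
    "Resolution",
    "Closing",
]
sections = merge_adjacent(predicted)
for label, start, end in sections:
    print(f"{label}: utterances {start}-{end}")
```

Each resulting span covers a contiguous block of utterances, so a label that recurs later in the call produces a separate section rather than being folded into the earlier one.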
