Speech segmentation and spoken document processing

Ostendorf, Mari; Favre, Benoît; Grishman, Ralph; Hakkani‐Tür, Dilek; Harper, Mary P.; Hillard, Dustin; Hirschberg, Julia; Ji, Heng; Kahn, Jeremy G.; Liu, Yang; Maskey, Sameer; Matusov, Evgeny; Ney, Hermann; Rosenberg, Andrew; Shriberg, Elizabeth; Wang, Wen; Wooters, C.

doi:10.1109/msp.2008.918023

Cited by 42 publications

(40 citation statements)

References 14 publications


“…They may be quite flexible, elliptic, restructured, and even incomplete (Blaauw, 1995). Structural metadata events (Liu et al., 2006; Ostendorf et al., 2008), i.e. punctuation marks and disfluencies, are being added to several corpora in EP, including CPE-FACES, in order to enrich automatic speech recognition outputs, for legibility purposes and also for the empirical study of interactions among different linguistic levels of analysis.…”

Section: Annotation Procedures (mentioning)

confidence: 99%

Stylistic variation in the intonation of European Portuguese teenagers and adults

Mata, Moniz, Batista. 2016. Issues in Hispanic and Lusophone Linguistics


The present study investigates intonation contours in phrase-final position in a corpus of spontaneous and prepared unscripted presentations by teenagers (14-15 years old) and adults, collected in a school context. Taking into account the differences between phrasing levels (ToBI breaks 3 and 4), we show that the frequency of low/falling vs. high/rising contours, mainly (H+)L* L and (L+)H* H, varies across oral presentation types. Adults and teenagers follow distinct strategies, though cross-gender differences are also a source of variation. We interpret these changes as an adaptation to the speaking styles specifically required at school, which call for the speaker's effort to speak clearly and to keep the listeners' attention, and ultimately as "intelligibility-oriented" speaking style changes.


“…According to Ostendorf et al. in [12], the segmentation of spoken language can be divided into audio diarization and structural segmentation. Audio diarization aims to distinguish speech from music through the grouping of acoustically homogeneous regions.…”

Section: Related Work (mentioning)

confidence: 99%

Speech and phoneme segmentation under noisy environment through spectrogram image analysis

Costa, Lopes, Mello et al. 2012. 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC)


This paper presents a new algorithm for speech segmentation based on image analysis of the spectrogram of the signal. The algorithm works in two passes: the first segments the sound to locate the speech signal, and the segmented speech is then fed back into the algorithm for phoneme segmentation. For evaluation, the algorithm was applied to TIMIT speech signals and correctly segmented the speech in every tested signal, including signals under real-world noise.


“…The main motivation of this work is to carry out a multiclass automatic classification task to determine, based on the prosodic properties of words, which are discourse markers, which are disfluencies, and which are sentence-like units (SUs). In automatic speech processing, punctuation marks (which delimit SUs), disfluencies, and discourse markers belong to a set of events known as structural metadata events (Liu et al., 2006; Ostendorf et al., 2008). Along these lines, the aim is to automatically recover punctuation and capitalization at sentence boundaries, as well as to annotate and filter disfluencies and markers.…”

Section: Introduction (unclassified)

“…The identification of punctuation marks (Batista, 2011; Batista et al., 2012; Moniz, 2013) and of disfluencies (e.g., lexicalized filled pauses such as "aam" and/or "mm", deletions, substitutions, among others) in the transcriptions has already allowed a significant improvement in the system's output, which resulted in a reduction of the recognition error rate (Moniz et al., 2014b). With the recent availability of a large amount of spontaneous speech corpora, it became possible to analyze the missing events, namely discourse markers (Liu et al., 2006; Ostendorf et al., 2008), the topic of the present study.…”

Section: Introduction (unclassified)

Classificação prosódica de marcadores discursivos

Cabarrão, Moniz, Ferreira et al. 2016. rapl


This work describes the discourse markers present in two corpora of European Portuguese from different domains (university lectures and map-task dialogues). We also perform a multiclass automatic classification task based on prosodic features to determine, in both corpora, which words are discourse markers, which are disfluencies, and which are sentence-like units (SUs). Results show that the selection of discourse markers varies across domains and between speakers. In the classification task, discourse markers are classified better in the lecture corpus (87%) than in the dialogue corpus (84%). However, cross-domain experiments showed that models trained on the dialogue corpus better predict the events in the lecture corpus, since this domain has more speakers and therefore complex patterns. In both corpora, markers are more easily classified as SUs than as disfluencies.
