2004
DOI: 10.1109/tsa.2004.828702

Automatic Recognition of Spontaneous Speech for Access to Multilingual Oral History Archives

Abstract: Much is known about the design of automated systems to search broadcast news, but it has only recently become possible to apply similar techniques to large collections of spontaneous speech. This paper presents initial results from experiments with speech recognition, topic segmentation, topic categorization, and named entity detection using a large collection of recorded oral histories. The work leverages a massive manual annotation effort on 10 000 h of spontaneous speech to evaluate the degree to w…

Cited by 98 publications (63 citation statements)
References: 24 publications

“…This technique is used to adapt language models [23,24] as well as translation models [25,26] or their combination [27]. Similar approaches to domain adaptation are also applied in other tasks, e.g., automatic speech recognition [28].…”
Section: Domain Adaptation (mentioning)
confidence: 99%
“…The ASR process (with a 38% measured word error rate on held out data) was optimized for the interviewee rather than the interviewer (by automatically detecting and then consistently using only the interviewee's microphone) and was trained using 200 hours of in-domain held out data along with other standard ASR training resources (Byrne et al, 2004). This resulted in the text contained in the ASRTEXT2004A field of each segment in the test collection.…”
Section: Test Collection (mentioning)
confidence: 99%
“…Planned speech, such as broadcast news, is typically organized into topical segments, for example, news stories, that can be treated as documents within the retrieval system. Spontaneous, conversational speech is less well structured and topic can change spontaneously [2]. Within the speech stream topic boundaries are challenging to identify and may not be well defined.…”
Section: What Is Spontaneous Conversational Speech? (mentioning)
confidence: 99%