2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)
DOI: 10.1109/asru.2007.4430116
Recognition and understanding of meetings: the AMI and AMIDA projects

Abstract: The AMI and AMIDA projects are concerned with the recognition and interpretation of multiparty meetings. Within these projects we have: developed an infrastructure for recording meetings using multiple microphones and cameras; released a 100 hour annotated corpus of meetings; developed techniques for the recognition and interpretation of meetings based primarily on speech recognition and computer vision; and developed an evaluation framework at both component and system levels. In this paper we present an over…

Cited by 111 publications (75 citation statements)
References 49 publications
“…It is a suitable collection for our experiments since investigation of the global semantic impact of speech recognition error requires reliable reference transcripts for the complete spoken document collection. As is typical for conversational speech, the word error rate for the corpus ranges up to around 40% [8]. We use the speaker turn segmentation provided with the corpus to divide the data into documents.…”
Section: Data
confidence: 99%
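
The document-splitting step quoted above is straightforward to illustrate. Below is a minimal Python sketch assuming a hypothetical list of (speaker, start, end, text) turn tuples, rather than the corpus's actual annotation format: runs of consecutive turns by the same speaker are merged into one document each.

    # Hypothetical turn records: (speaker_id, start_time, end_time, text).
    # The real AMI segmentation ships as XML annotations; this flat format
    # is assumed purely for illustration.
    def turns_to_documents(turns):
        """Merge runs of consecutive same-speaker turns into documents."""
        documents = []
        current_speaker, current_text = None, []
        for speaker, start, end, text in turns:
            if speaker != current_speaker and current_text:
                documents.append((current_speaker, " ".join(current_text)))
                current_text = []
            current_speaker = speaker
            current_text.append(text)
        if current_text:
            documents.append((current_speaker, " ".join(current_text)))
        return documents

    turns = [
        ("A", 0.0, 2.1, "okay let's start"),
        ("A", 2.1, 4.0, "first the budget"),
        ("B", 4.0, 6.5, "I have the figures here"),
    ]
    for speaker, doc in turns_to_documents(turns):
        print(speaker, "->", doc)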
“…We use the AMI Meeting Corpus (release 1.4) [8], which consists of 100 hours of multimodal data recorded from scenario-based meetings. Included in the corpus are automatic speech recognition transcripts and human-generated reference transcripts.…”
Section: Data
confidence: 99%
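
Having both ASR output and reference transcripts is what makes word error rate figures like the ~40% quoted above computable. A self-contained sketch of the standard Levenshtein-alignment WER follows; note this is the textbook algorithm, not necessarily the scoring tool (e.g., NIST sclite) used in the cited evaluations.

    # Word error rate by Levenshtein alignment over word tokens:
    # (substitutions + insertions + deletions) / reference length.
    def word_error_rate(reference, hypothesis):
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
                dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)

    print(word_error_rate("the meeting starts now", "a meeting starts"))  # 0.5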
“…Discriminative non-linear feature transformations can provide yet further gains in performance, because the transformation is optimized to reduce the error rate in the context of the decoder (e.g., [18]). Some of the popular non-linear transforms provide an approximately piece-wise linear transform by the inclusion of "regionbased" features based on Gaussian posterior probabilities.…”
Section: Introduction
confidence: 99%
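
The "region-based" construction the excerpt refers to can be sketched as follows: compute per-frame posterior probabilities under a set of Gaussians, then add a trained linear projection of that posterior vector to the original features, so the overall transform is approximately piecewise linear (in the style of fMPE). All parameters below are placeholders; in a real system the projection would be trained discriminatively against decoder errors.

    import numpy as np

    # Sketch of region-based Gaussian posterior features. The GMM
    # parameters and projection matrix M are placeholder values, not a
    # trained model.
    rng = np.random.default_rng(0)
    D, G = 39, 128               # feature dim, number of Gaussians
    means = rng.normal(size=(G, D))
    log_prec = np.zeros((G, D))  # log diagonal precisions (unit here)
    M = rng.normal(scale=0.01, size=(D, G))  # posteriors -> feature offset

    def gaussian_posteriors(x):
        # log N(x; mu_g, diag cov) up to a shared constant
        log_lik = -0.5 * np.sum((x - means) ** 2 * np.exp(log_prec), axis=1)
        log_lik -= log_lik.max()              # numerical stability
        p = np.exp(log_lik)
        return p / p.sum()

    def transform(x):
        # Piecewise-linear effect: the added offset M @ p(x) depends on
        # which Gaussian "region" the frame falls in.
        return x + M @ gaussian_posteriors(x)

    frame = rng.normal(size=D)
    print(transform(frame).shape)  # (39,)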
“…Speaker adaptation methods such as SAT and fMLLR were originally developed for decreasing the variation between speakers, but they are also known to improve ASR accuracy in noisy environments by adapting to unknown and changing noise conditions, in effect performing noise adaptive training [12], [25], [39]. Discriminative non-linear feature transformations can provide yet further gains in performance, because the feature transformation is optimized to directly reduce the error rates of the decoder [33].…”
Section: Introduction
confidence: 99%
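
For concreteness, fMLLR applies a single affine transform x' = Ax + b, estimated per speaker (or, as above, per noise condition), to every feature frame. A minimal sketch of the application step follows; the maximum-likelihood estimation of A and b against the acoustic model is omitted, and the values here are placeholders.

    import numpy as np

    # Applying a per-speaker fMLLR transform x' = A x + b. Estimating A
    # and b (row-by-row likelihood maximisation under the acoustic model)
    # is omitted; these values are placeholders.
    rng = np.random.default_rng(0)
    D = 39
    A = np.eye(D) + 0.01 * rng.normal(size=(D, D))
    b = 0.1 * rng.normal(size=D)

    def fmllr_apply(features, A, b):
        """features: (T, D) array of frames for one speaker."""
        return features @ A.T + b

    frames = rng.normal(size=(100, D))
    adapted = fmllr_apply(frames, A, b)
    print(adapted.shape)  # (100, 39)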