2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2012.6288828

N-best entropy based data selection for acoustic modeling

Abstract: This paper presents a strategy for efficiently selecting informative data from large corpora of untranscribed speech. Confidence-based selection methods (i.e., selecting utterances we are least confident about) have been a popular approach, though they consider only the top hypothesis when selecting utterances and tend to select outliers, and therefore do not always improve overall recognition accuracy. Alternatively, we propose a method for selecting data by looking at competing hypotheses, computing the entropy of the N…
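The abstract is truncated here, but the core idea, scoring an utterance by the entropy of the posterior distribution over its N-best recognition hypotheses, can be sketched in a few lines. The function below is an illustrative assumption (the name and the softmax normalization are ours, not taken from the paper):

```python
import math

def nbest_entropy(hypothesis_scores):
    """Entropy of the posterior distribution over an N-best list.

    `hypothesis_scores` are per-hypothesis log-likelihoods (e.g.,
    combined acoustic and language model scores). They are turned
    into posteriors with a softmax before computing Shannon entropy.
    """
    # Subtract the max before exponentiating for numerical stability.
    m = max(hypothesis_scores)
    exps = [math.exp(s - m) for s in hypothesis_scores]
    z = sum(exps)
    posteriors = [e / z for e in exps]
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)

# A flat N-best list (competing hypotheses) scores high, a peaked one low:
print(nbest_entropy([-10.0, -10.1, -10.2]))   # ~1.09 nats: informative
print(nbest_entropy([-10.0, -20.0, -30.0]))   # ~0.0005 nats: model is sure
```

Under this criterion, an utterance whose N-best hypotheses receive nearly equal posterior mass (high entropy) is a stronger candidate for manual transcription than one with a single dominant hypothesis.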

Cited by 23 publications (20 citation statements) | References 13 publications
“…Similarly, according to the distribution of context-dependent HMM states in a development set, Siohan [28], [29] proposed to select data for acoustic modeling. Itoh et al. [27] suggested that when selecting acoustic data, the informativeness and representativeness of the data should be assessed at the same time.…”
Section: Target Language Acoustic Data Selection
mentioning, confidence: 99%
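The state-distribution idea attributed to Siohan [28], [29] can be illustrated with a minimal greedy sketch. Everything below (the function name, the deficit heuristic, the data layout) is an assumption for illustration, not the published algorithm:

```python
from collections import Counter

def greedy_state_matching(candidates, target_dist, budget):
    """Greedy sketch: pick utterances whose context-dependent HMM
    states are most under-represented relative to a dev-set target.

    candidates  : dict utt_id -> Counter of HMM-state occupancies
    target_dist : dict state -> relative frequency in the dev set
    budget      : number of utterances to select
    """
    selected, have, pool = [], Counter(), dict(candidates)
    for _ in range(budget):
        if not pool:
            break
        total = sum(have.values()) or 1

        def deficit(counts):
            # Credit each state by how far its current share falls
            # short of its target share in the development set.
            return sum(c * max(0.0, target_dist.get(s, 0.0) - have[s] / total)
                       for s, c in counts.items())

        best = max(pool, key=lambda u: deficit(pool[u]))
        selected.append(best)
        have.update(pool.pop(best))
    return selected
```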
“…[6] proposed a lattice-entropy-based measure and selecting utterances based on global entropy reduction. [7] observed that lattice entropy is correlated with utterance length and showed N-best entropy to be an empirically better criterion. In this work, we also use an entropy-based measure as the informativeness criterion for data selection.…”
Section: Uncertainty Based Informativeness Criterion
mentioning, confidence: 99%
“…The difference in cross-entropy is used as a measure of relevance, and the average entropy based on confusion networks is used as a measure of uncertainty or informativeness. Both scores are in log-scale, and we use a simple weighted combination to combine them [7]. The final score is given by…”
Section: Score Combination
mentioning, confidence: 99%
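The excerpt cuts off before the combined score itself. A generic weighted combination consistent with the description (the interpolation weight λ and the score symbols below are our assumptions, not the notation of [7]) would look like:

```latex
% S_inf: uncertainty/informativeness (average confusion-network entropy)
% S_rel: relevance (difference in cross-entropy); both in log-scale
% \lambda \in [0, 1] is an assumed interpolation weight
S(u) = \lambda \, S_{\mathrm{inf}}(u) + (1 - \lambda) \, S_{\mathrm{rel}}(u)
```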