2012
DOI: 10.1587/transinf.e95.d.614

Bayesian Learning of a Language Model from Continuous Speech

Abstract: We propose a novel scheme to learn a language model (LM) for automatic speech recognition (ASR) directly from continuous speech. In the proposed method, we first generate phoneme lattices using an acoustic model with no linguistic constraints, then perform training over these phoneme lattices, simultaneously learning both lexical units and an LM. As a statistical framework for this learning problem, we use non-parametric Bayesian statistics, which make it possible to balance the learned model's complexi…
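As a rough illustration of the setup sketched in the (truncated) abstract, the following is a hypothetical Python skeleton, not the authors' code: a phoneme recognizer with no linguistic constraints produces phoneme output per utterance, and an iterative learner then segments that output into word-like units while accumulating language-model statistics. All function names are placeholders, and the greedy "sampler" stands in for the paper's lattice-based non-parametric Bayesian inference.

```python
# Minimal sketch (not the authors' code) of the two-stage setup described in the
# abstract: an acoustic model with no linguistic constraints produces phoneme
# output, and a learner then discovers lexical units and an LM jointly.
from collections import defaultdict

def recognize_phoneme_lattice(utterance_audio):
    """Stand-in for lattice generation; here it just returns one phoneme string."""
    return ["k", "o", "n", "n", "i", "ch", "i", "w", "a"]  # toy output

def sample_segmentation(phonemes, lm_counts):
    """Toy 'sampler': greedily cut the phoneme string into chunks of length <= 3.
    (lm_counts is unused here; a real system samples segmentations in proportion
    to LM and lattice probabilities, e.g. with blocked Gibbs sampling.)"""
    words, i = [], 0
    while i < len(phonemes):
        step = min(3, len(phonemes) - i)
        words.append("".join(phonemes[i:i + step]))
        i += step
    return words

def learn_lm(utterances, iterations=5):
    lm_counts = defaultdict(int)          # unigram counts over discovered "words"
    for _ in range(iterations):
        for audio in utterances:
            phonemes = recognize_phoneme_lattice(audio)
            for w in sample_segmentation(phonemes, lm_counts):
                lm_counts[w] += 1         # update language-model statistics
    return lm_counts

if __name__ == "__main__":
    print(learn_lm([None]))               # toy corpus of one "utterance"
```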

Cited by 48 publications (85 citation statements)
References 32 publications
“…The word segmentation part needs to allow for variability in the phonetic realization of words, but can also provide top-down pressure for sounds in similar contexts to be labeled as the same phone. Steps in this direction have been taken [16,31,32], but we are aware of no fully integrated system using speech data as input. It might also be possible to improve results on automatic tokenizations by identifying each token as, say, consonantal or vocalic.…”
Section: Evaluation and Discussion
confidence: 99%
“…These algorithms, however, don't consider recognition errors in the phoneme sequences as we do. Word discovery on noisy phoneme lattices was also considered in [41], using similar methods. A comparison in [26] showed greatly improved F-scores of our proposed method compared to [41] for word n-grams greater than 1.…”
Section: Results
confidence: 99%
“…The second level is targeted at the discovery of the lexical units, the words, and learning their probabilities, the language model, from the phoneme sequences of the first level [55,41,25,26,39]. In speech recognition, the mapping of words to phoneme sequences is typically determined by a pronunciation lexicon.…”
Section: Representation Learning From Sequential Data
confidence: 99%
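The quoted passage notes that conventional ASR maps words onto phoneme sequences through a pronunciation lexicon. A tiny illustration of that mapping (entries invented for the example, not taken from the cited papers):

```python
# Illustrative pronunciation lexicon: words map to phoneme sequences.
lexicon = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

def words_to_phonemes(words):
    """Expand a word sequence into its phoneme sequence via lexicon lookup."""
    return [p for w in words for p in lexicon[w]]

print(words_to_phonemes(["hello", "world"]))
# ['HH', 'AH', 'L', 'OW', 'W', 'ER', 'L', 'D']
```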
“…This is achieved using blocked Gibbs sampling, with each utterance constituting one block. To sample from WFSTs, we use forward-filtering/backward-sampling (Scott, 2002; Neubig et al., 2012), creating forward probabilities using the forward algorithm for hidden Markov models before backward-sampling edges proportionally to the product of the forward probability and the edge weight. [Footnote 3: No Metropolis-Hastings rejection step was used.]…”
Section: Inference
confidence: 99%
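The quoted passage applies forward-filtering/backward-sampling (Scott, 2002) to WFST edges. As a rough illustration of the same idea, here is a minimal sketch for a plain discrete HMM (toy parameters, not from the cited papers): forward probabilities are computed with the forward algorithm, and a state path is then sampled backward in proportion to the forward probability times the transition weight into the already-sampled next state.

```python
# Minimal forward-filtering/backward-sampling sketch for a discrete HMM.
# In the cited work the same scheme runs over lattice/WFST edges rather than
# a dense transition matrix; all numbers below are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

pi = np.array([0.6, 0.4])                 # initial state probabilities
A = np.array([[0.7, 0.3], [0.2, 0.8]])    # transitions A[i, j] = p(j | i)
B = np.array([[0.9, 0.1], [0.3, 0.7]])    # emissions   B[i, o] = p(o | i)
obs = [0, 1, 1, 0]                        # observed symbol sequence

# Forward filtering: alpha[t, i] = p(o_1..o_t, z_t = i)
alpha = np.zeros((len(obs), len(pi)))
alpha[0] = pi * B[:, obs[0]]
for t in range(1, len(obs)):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

# Backward sampling: draw the last state from alpha[-1], then each earlier state
# in proportion to (forward probability) * (transition weight into sampled state).
states = [rng.choice(len(pi), p=alpha[-1] / alpha[-1].sum())]
for t in range(len(obs) - 2, -1, -1):
    w = alpha[t] * A[:, states[-1]]
    states.append(rng.choice(len(pi), p=w / w.sum()))
states.reverse()
print(states)                              # one posterior sample of the state path
```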