2009
DOI: 10.1109/tasl.2008.2012323
Importance of High-Order N-Gram Models in Morph-Based Speech Recognition



citations
Cited by 80 publications
(59 citation statements)
references
References 25 publications
“…N-gram models are widely used in statistical natural language processing and speech recognition. An n-gram is a sub-sequence of n overlapping items (characters, letters, words, etc.) from a given sequence [15][16]. For example, applying a 2-gram character model to the string "benign" yields "be", "en", "ni", "ig", "gn".…”
Section: B. N-Gram Models
confidence: 99%
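The sliding-window extraction described in the excerpt above can be sketched in a few lines of Python (the function name is illustrative, not taken from the cited papers):

```python
def char_ngrams(text, n):
    """Extract overlapping character n-grams with a sliding window."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

# A 2-gram character model applied to the string "benign"
print(char_ngrams("benign", 2))  # ['be', 'en', 'ni', 'ig', 'gn']
```

The same function works for word- or morph-level n-grams when given a list of tokens instead of a string.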
“…[15] The extension combines Breiman's "bagging" idea and random selection of features, introduced first by Ho [1] and later independently by Amit and Geman [16] in order to construct a collection of decision trees with controlled variance.…”
Section: A. Random Forest
confidence: 99%
“…Acoustic model parameters are trained on 30 hours of data drawn from the Finnish SPEECON database, consisting of clean speech utterances recorded with a headset in quiet environments (SNR 16-44 dB). The decoder is a time-synchronous beam-pruned Viterbi token-pass system, and the language model is a morph-based growing n-gram model (Hirsimäki et al, 2009) trained on Finnish book and newspaper data with 145 million words. The vocabulary is in practice unlimited, since all words and word forms can be represented with statistical morphs (Hirsimäki et al, 2006).…”
Section: Baseline System
confidence: 99%
“…An n-gram is a sub-sequence of n overlapping items (characters, letters, words, etc.) from a given sequence [16]. N-gram sequences are then used to construct n-gram frequency vectors, which express the frequency of appearance of every n-byte and n-opcode.…”
Section: Feature Extraction
confidence: 99%
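The feature-extraction step described in the excerpt above — turning a sequence into a fixed-length n-gram frequency vector — can be sketched as follows. The opcode trace and vocabulary are hypothetical examples, not data from the cited work:

```python
from collections import Counter

def ngrams(items, n):
    """Overlapping n-grams of a sequence, as tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]

def frequency_vector(items, n, vocabulary):
    """Count how often each vocabulary n-gram appears in the sequence,
    returned in fixed vocabulary order so it can serve as a feature vector."""
    counts = Counter(ngrams(items, n))
    return [counts[g] for g in vocabulary]

# Hypothetical opcode trace and 2-gram vocabulary (illustrative names only)
trace = ["mov", "add", "mov", "add", "jmp"]
vocab = [("mov", "add"), ("add", "mov"), ("add", "jmp"), ("jmp", "mov")]
print(frequency_vector(trace, 2, vocab))  # [2, 1, 1, 0]
```

Fixing the vocabulary order is what makes vectors from different sequences comparable as inputs to a classifier.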