2018
DOI: 10.1016/j.csl.2017.06.005

Uncertainty weighting and propagation in DNN–HMM-based speech recognition

Cited by 32 publications (8 citation statements)
References 9 publications
“…It should be noted that the Viterbi path is the most probable state sequence based on the state transition probabilities and the probability of the observed values fitting the emissions distribution of a given state. While the Viterbi path is a useful tool for visualizing the state occurrences through time, there is, however, uncertainty associated with the state classification that has been addressed in studies elsewhere (Hernando et al 2005;Novoa et al 2018).…”
Section: Results | Citation type: mentioning
confidence: 99%
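
The citation statement above describes the Viterbi path as the most probable state sequence given the transition and emission probabilities. As a rough illustration only, the following is a minimal log-space sketch of that decoding step. The function name `viterbi`, the two-state toy model, and all probability values are hypothetical and are not taken from the cited papers.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_path, best_log_prob) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    # back[t][s]: predecessor of state s on that best path
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state to recover the full path
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Hypothetical toy model (illustrative values only)
states = ["Healthy", "Fever"]
log_start = logs({"Healthy": 0.6, "Fever": 0.4})
log_trans = {
    "Healthy": logs({"Healthy": 0.7, "Fever": 0.3}),
    "Fever": logs({"Healthy": 0.4, "Fever": 0.6}),
}
log_emit = {
    "Healthy": logs({"normal": 0.5, "cold": 0.4, "dizzy": 0.1}),
    "Fever": logs({"normal": 0.1, "cold": 0.3, "dizzy": 0.6}),
}

path, log_prob = viterbi(["normal", "cold", "dizzy"], states,
                         log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

The sketch returns a single best path and its joint probability. It does not quantify the uncertainty of the state classification that the citing authors mention; that would need, for example, posterior state probabilities from the forward-backward algorithm.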
“…Deep learning produces successful results for many problems that cause problems in the advancement of artificial intelligence methods. It is used in many areas such as image recognition [17][18][19], speech processing [20,21], and gene mutation prediction [22]. It is thought that DNN will make faster progress thanks to the new neural network structures proposed with an increase in computation capacity and amount of data [16].…”
Section: Deep Neural Network (DNN) | Citation type: mentioning
confidence: 99%
“…The use of conventional hidden Markov models (HMMs) and deep neural networks (DNNs) of automatic speech recognition (ASR) systems in the preparation of a lexicon, acoustic models, and language models results in complications [1]. These approaches also require linguistic resources, such as a pronunciation dictionary, tokenization, and phonetic context dependencies [2]. In contrast, end-to-end ASR has grown to be a popular alternative to simplify the conventional ASR model building process.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…The Amharic language mainly consists of seven vowels [18], namely ኧ[ә], ኡ[u], ኢ[i], ኣ[a], ኤ[e], እ [1], ኦ [o]. This language has 32 consonants [18,22] that are categorized based on their articulation stops (14), fricatives (8), affricatives (3), nasals (3), liquids (2), and glides (2). These consonants are indicated in Appendix B with their corresponding International Phonetic Alphabet (IPA) representations.…”
Section: Introduction | Citation type: mentioning
confidence: 99%