2013
DOI: 10.1007/978-3-642-34425-1

Hierarchical Neural Network Structures for Phoneme Recognition

Cited by 7 publications
(4 citation statements)
References 0 publications
“…For instance, there is no pair of separate English words which are identical except that where one member of the pair has breathy [p] where the other has non breathy [p]. (pp.81-82) Vasquez, Gruhn, and Minker (2012) explain "phonemes are realized as individual sounds (phones) i.e., phones are instances of phonemes. Each phoneme is characterized by a prototype which mainly describes the way phones are produced (articulation).…”
Section: Discussion
confidence: 99%
“…As a middle ground between assuming gold phonetic transcriptions (cf. Section 6) and no transcriptions at all, we use noisy transcriptions by running speech recognizers for other languages on the Spanish speech: Russian (ru), Hungarian (hu) and Czech (cz) (Vasquez et al, 2012). These distantly related languages were chosen to be a better approximation to the low-resource scenario.…”
Section: Alignment Evaluation
confidence: 99%
“…Some authors observed that the posterior estimates obtained can be "enhanced" by training yet another network-but this time on a sequence of output vectors coming from the first network [22]. Other authors refer to this approach as the "hierarchical modeling" [23][24][25] or the "stacked modeling" method [26]. Two trivial improvements to this approach are when the upper net downsamples the output of the lower one [24,27] and/or when it uses the output of some bottleneck layer instead of the uppermost softmax layer [25,26].…”
Section: Introduction
confidence: 99%
“…Other authors refer to this approach as the "hierarchical modeling" [23][24][25] or the "stacked modeling" method [26]. Two trivial improvements to this approach are when the upper net downsamples the output of the lower one [24,27] and/or when it uses the output of some bottleneck layer instead of the uppermost softmax layer [25,26]. Veselý's proposal was to treat this hierarchical construct as one joint model, and he also explained why the compound structure can be interpreted as a deep convolutional network [21].…”
Section: Introduction
confidence: 99%
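The hierarchical ("stacked") modeling described in the Introduction statements above can be sketched in a few lines: a lower network maps acoustic features to per-frame phoneme posteriors, and an upper network refines a downsampled, context-stacked version of those posteriors. This is a minimal illustration with untrained random weights; the layer sizes, downsampling factor of 2, and ±1-frame context window are assumptions for the sketch, not values from the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, W1, b1, W2, b2):
    # Single hidden layer; softmax output gives per-frame phoneme posteriors.
    h = np.tanh(x @ W1 + b1)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

T, n_feat, n_hid, n_phones = 20, 13, 32, 40

# Lower network: acoustic features -> phoneme posteriors, frame by frame.
W1, b1 = rng.normal(size=(n_feat, n_hid)), np.zeros(n_hid)
W2, b2 = rng.normal(size=(n_hid, n_phones)), np.zeros(n_phones)
feats = rng.normal(size=(T, n_feat))
lower_post = mlp_forward(feats, W1, b1, W2, b2)    # shape (T, n_phones)

# Upper network input: downsample the lower net's output by a factor of 2
# and stack a context window of +/-1 downsampled frame (edges wrap here,
# purely for brevity).
down = lower_post[::2]                             # shape (T//2, n_phones)
ctx = np.concatenate([np.roll(down, s, axis=0) for s in (1, 0, -1)], axis=1)

# Upper network: trained (in the real setting) on sequences of output
# vectors from the lower one, yielding "enhanced" posterior estimates.
U1, c1 = rng.normal(size=(ctx.shape[1], n_hid)), np.zeros(n_hid)
U2, c2 = rng.normal(size=(n_hid, n_phones)), np.zeros(n_phones)
upper_post = mlp_forward(ctx, U1, c1, U2, c2)      # shape (T//2, n_phones)

print(upper_post.shape)
print(bool(np.allclose(upper_post.sum(axis=1), 1.0)))
```

The bottleneck-feature variant mentioned in the same passage would feed the upper network the lower network's hidden activations (`h` in `mlp_forward`) instead of its softmax output; the stacking and downsampling steps are unchanged.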