1976
DOI: 10.1121/1.2003011

Continuous speech recognition via centisecond acoustic states

Abstract: Continuous speech was treated as if produced by a finite-state machine making a transition every centisecond. The observable output from state transitions was considered to be a power spectrum—a probabilistic function of the target state of each transition. Using this model, observed sequences of power spectra from real speech were decoded as sequences of acoustic states by means of the Viterbi trellis algorithm. The finite-state machine used as a representation of the speech source was composed of machines re… Show more
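The decoding the abstract describes can be sketched as a standard Viterbi trellis search for the most likely state sequence given per-frame observations. The state counts, probabilities, and discrete observation symbols below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def viterbi(log_trans, log_emit, obs):
    """Viterbi trellis decoding of an observation sequence.

    log_trans: (S, S) array of log transition probabilities
    log_emit:  (S, V) array of log emission probabilities per symbol
    obs:       list of observation symbol indices, one per frame
    Returns the most likely state index per frame.
    """
    S = log_trans.shape[0]
    T = len(obs)
    delta = np.full((T, S), -np.inf)   # best log score ending in each state
    psi = np.zeros((T, S), dtype=int)  # backpointers
    delta[0] = log_emit[:, obs[0]]     # uniform start folded into frame 0
    for t in range(1, T):
        # scores[i, j] = best score in state i at t-1, then i -> j transition
        scores = delta[t - 1][:, None] + log_trans
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[:, obs[t]]
    # backtrack from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```

With two "sticky" states that each prefer one symbol, the decoder tracks the symbol switch in the observations.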

Cited by 129 publications (61 citation statements)
References 0 publications
“…Firstly, not only is our training algorithm able to find the optimised number of states to represent the image, it also outperforms those trained by the Baum-Welch algorithm, since it achieves a higher average maximum likelihood. Secondly, the GA-HMM confirms the observation made by Bakis [14] that the number of states of the HMM should correspond to the number of observations in a sequence.…”
Section: Experiments Performance (supporting)
confidence: 52%
“…For each candidate solution, a number of states, which is an integer between 4 and 11, was randomly generated, since according to Bakis [14], the number of states would usually be identical to the number of observation sequences. In this work, 9 observation sequences are used to represent the various subimages, thus the least number of states is set to 4 and the maximum number of states to 11.…”
Section: Initial Population (mentioning)
confidence: 99%
“…In a second step, the conditional probability p(j(k_i) | j(k_{i−1})) is replaced by the Bakis model known from speech recognition [5]: Fig. 2 shows the different values δ can take.…”
Section: The Conditional Model (mentioning)
confidence: 99%
“…1(b) on the other hand shows two left-right models which are connected in a cross-coupled manner. A 4-state left-right or Bakis model [17,18] is also shown in Fig. 1(c).…”
Section: Introduction (mentioning)
confidence: 99%
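The left-to-right (Bakis) transition structure these excerpts refer to can be illustrated with a small sketch: from state i, only a self-loop and forward jumps of up to a few states are allowed. The uniform split of probability over allowed successors is an assumption for illustration, not part of the original formulation:

```python
import numpy as np

def bakis_transition_matrix(n_states, max_jump=2):
    """Build a left-to-right (Bakis) HMM transition matrix.

    From state i, only states i..i+max_jump are reachable (clipped at the
    final state); probability mass is split uniformly over them here.
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states):
        allowed = range(i, min(i + max_jump + 1, n_states))
        for j in allowed:
            A[i, j] = 1.0 / len(allowed)
    return A
```

Every row sums to 1, all entries below the diagonal are zero (no backward transitions), and the last state can only loop on itself.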
“…of models like pattern [1,2], DNA sequence analysis [3,4], pathologies [5] or speech recognition [6,7]. The basic theory of HMM was published in a series of classic papers by Baum and his colleagues in the late 1960s and early 1970s which was then implemented for speech recognition applications by Baker at Carnegie Mellon University (CMU) and by Jelinek and his colleagues at IBM in the 1970s [8][9][10][11][12][13][14][15][16][17][18][19][20] and further explored by L. Rabiner, et al [21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36] in 1980s and the early 1990s.…”
Section: Introduction (mentioning)
confidence: 99%