2003
DOI: 10.1016/s0167-6393(02)00048-1

Clustering of triphones using phoneme similarity estimation for the definition of a multilingual set of triphones

Cited by 11 publications (8 citation statements)
References 13 publications
“…TIMIT has been chosen because it includes accurate time-aligned phonetic transcriptions, meaning that both phonetic labels and their start/end times are known. As our desired clusters we use triphones, which are phones in specific left and right contexts [27]. We consider triphones that occur at least 20 times and at most 25 times in the corpus.…”
Section: Data (mentioning)
confidence: 99%
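The selection step described in the statement above (keep triphones that occur between 20 and 25 times) can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function names `triphones` and `select_triphones`, the list-of-phone-lists input format, and the toy utterances are all assumptions, and real TIMIT transcriptions would need to be parsed into this shape first.

```python
from collections import Counter


def triphones(phones):
    """Yield (left, center, right) for every interior phone of one utterance."""
    for i in range(1, len(phones) - 1):
        yield (phones[i - 1], phones[i], phones[i + 1])


def select_triphones(utterances, min_count=20, max_count=25):
    """Count triphones over all utterances and keep those whose
    frequency lies in [min_count, max_count] (bounds from the quote)."""
    counts = Counter(t for utt in utterances for t in triphones(utt))
    return {t: n for t, n in counts.items() if min_count <= n <= max_count}


# Toy data, invented for illustration only (not taken from TIMIT).
utterances = [["sil", "k", "ae", "t", "sil"]] * 22 + [["sil", "d", "ao", "g", "sil"]] * 3
selected = select_triphones(utterances)
```

With this toy input only the triphones from the first utterance type survive the filter, because the second type occurs just three times.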
“…Several similarity measure proposals can be found in literature [4,16,25], but as referred in [26], they do not fulfil the properties of a proper metric. In the present proposal, d_1 has the three metric properties: it is positive, symmetric and satisfies the triangle inequality.…”
Section: Similarity Measure (mentioning)
confidence: 99%
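The three properties named in the statement above can be checked numerically on any finite table of pairwise distances. The sketch below is a generic check, not the d_1 measure of the citing paper, whose definition is not reproduced here. The function name `is_metric` and the example matrices are assumptions; the check also requires a zero diagonal, which is an added assumption beyond the three properties quoted.

```python
from itertools import product


def is_metric(d, tol=1e-12):
    """Return True if the square matrix d (list of lists) is non-negative,
    has a zero diagonal, is symmetric, and satisfies the triangle inequality."""
    n = len(d)
    for i, j in product(range(n), repeat=2):
        if d[i][j] < -tol:                    # positivity (non-negativity)
            return False
        if i == j and abs(d[i][j]) > tol:     # zero self-distance (assumed here)
            return False
        if abs(d[i][j] - d[j][i]) > tol:      # symmetry
            return False
    for i, j, k in product(range(n), repeat=3):
        if d[i][k] > d[i][j] + d[j][k] + tol: # triangle inequality
            return False
    return True


# Invented example matrices over three phone-like units.
good = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
bad = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]  # 5 > 1 + 1 breaks the triangle inequality
```

A similarity score derived from a confusion matrix, for instance, is often asymmetric and would fail the symmetry test unless it is symmetrized first.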
“…The key point of all clustering algorithms is the choice of a proximity or distance measure. This measure can be obtained from acoustic models, e.g., [3,15], or even rely on the confusion matrix, e.g., [4,16]. Model-driven methods and confusion-driven methods are then the two major categories of data-driven phone clustering algorithms.…”
Section: Introduction (mentioning)
confidence: 99%
“…In data-driven methods, similarity between phonetic units across languages is commonly estimated by evaluating the distance of their language-dependent acoustic models (i.e. HMMs) using agglomerative (Köhler, 2001;Salvi, 2003;Imperl et al, 2003), decision tree based (Schultz and Waibel, 2001), or a combination of decision tree and agglomerative (Mariño et al, 2000) clustering algorithms. Other data-driven approaches find the similarity between phones by means of a confusion matrix (Byrne et al, 2000).…”
Section: Introduction (mentioning)
confidence: 99%
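Agglomerative clustering over a table of pairwise distances, as mentioned in the statement above, can be sketched in a few lines. This is a generic average-linkage illustration and not the procedure of any of the cited papers: the function name `agglomerative`, the stopping threshold, the unit labels, and the distance values are all hypothetical, and in practice the distances would come from acoustic models or a confusion matrix.

```python
def agglomerative(dist, labels, threshold):
    """Repeatedly merge the two closest clusters (average linkage) until the
    closest remaining pair is farther apart than `threshold`."""
    clusters = [[i] for i in range(len(labels))]

    def linkage(a, b):
        # Mean pairwise distance between the members of clusters a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, p, q = min(
            ((linkage(clusters[p], clusters[q]), p, q)
             for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda t: t[0],
        )
        if d > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]  # merge q into p
        del clusters[q]
    return [sorted(labels[i] for i in c) for c in clusters]


# Invented distances between four language-tagged units.
labels = ["a_en", "a_de", "s_en", "s_de"]
dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
groups = agglomerative(dist, labels, threshold=0.5)
```

With a threshold of 0.5, the toy example pairs each unit with its cross-language counterpart and then stops, since the two resulting clusters are 0.9 apart.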
“…Other data-driven approaches find the similarity between phones by means of a confusion matrix (Byrne et al, 2000). Measuring similarity between language-context-dependent phonetic units, such as demiphones (Mariño et al, 2000), triphones (Imperl et al, 2003) or pentaphones (Schultz and Waibel, 2001) provide better recognition results than measuring similarity between language-context-independent units. In addition, (Imperl et al, 2003) conclude that although an agglomerative clustering algorithm yields a limited number of clusters, the decision tree method gives better recognition results and solves modeling units that are not seen in the training data.…”
Section: Introduction (mentioning)
confidence: 99%