2014
DOI: 10.1109/lsp.2013.2295397

Estimating Speaker Height and Subglottal Resonances Using MFCCs and GMMs

Abstract: This letter investigates the use of MFCCs and GMMs for 1) improving the state of the art in speaker height estimation, and 2) rapid estimation of subglottal resonances (SGRs) without relying on formant and pitch tracking (unlike our previous algorithm in [1]). The proposed system comprises a set of height-dependent GMMs modeling static and dynamic MFCC features, where each GMM is associated with a height value. Furthermore, since SGRs and height are correlated, each GMM is also associated with a set o…
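
The idea of height-dependent GMMs can be illustrated with a minimal sketch. The abstract is truncated before it states the decision rule, so the maximum-likelihood selection below is an assumption, not the authors' confirmed method. The function names, the toy model parameters, and the two-dimensional "MFCC" frames are all hypothetical and chosen only to keep the example small.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def estimate_height(frames, height_models):
    """Pick the height whose GMM gives the highest mean frame log-likelihood.

    Assumption: a maximum-likelihood decision over the height-dependent
    models; the source abstract does not spell out the rule.
    """
    best_height, best_score = None, -math.inf
    for height_cm, (weights, means, variances) in height_models.items():
        total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
        score = total / len(frames)
        if score > best_score:
            best_height, best_score = height_cm, score
    return best_height


# Hypothetical toy models: height in cm -> (weights, means, variances).
TOY_MODELS = {
    160: ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    180: ([0.5, 0.5], [[4.0, 4.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
# Made-up two-dimensional feature frames standing in for MFCC vectors.
TOY_FRAMES = [[4.2, 4.1], [4.9, 5.2], [4.5, 4.4]]

if __name__ == "__main__":
    print(estimate_height(TOY_FRAMES, TOY_MODELS))
```

In a real system the frames would be static and dynamic MFCC vectors and the models would be trained on speech from speakers of known height; both steps are outside the scope of this sketch.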

Cited by 19 publications (10 citation statements)
References: 13 publications

“…Most certainly, the computer has exceeded human ability of laymen when considering the domain of health state or pathology assessment from the voice and words such as when automatically diagnosing Autism Spectrum Condition [11], Alzheimer's [12] or Parkinson's disease [13]. Other examples exist such as predicting height [14] or heart rate [15] from voice acoustics down to some centimetres or beats per minute, where automatic approaches are likely a notch ahead, albeit human perception tests for comparison are largely missing. Mostly in the psychological and phonetic literature, some do exist such as for exemptions for human age perception in speech such as [16], [17] or speaker height such as [?].…”
Section: A Superhuman Yet? (mentioning)
confidence: 99%
“…Many of these works employ Mel Frequency Cepstral Coefficients (MFCC) to process the recorded sound waves [1,20,24,30,35,40,43,54]. In point of fact, since the mid-eighties MFCC has been the most widely used feature extraction method in the field of ASR [19].…”
Section: Mel Frequency Cepstral Coefficient (mentioning)
confidence: 99%
“…Among various physical characteristics, scientific studies have investigated the correlation between voice characteristics and a speaker's age and height. Authors of [1,2] reported that the vocal tract length, sub-glottal resonance frequencies, and formant frequencies are correlated with the individual's height. Other voice characteristics of speech such as speech rate, sound pressure level, fundamental frequency, etc.…”
Section: Introduction (mentioning)
confidence: 99%