Automatic estimation of one's age with his/her speech based upon acoustic modeling techniques of speakers

Minematsu, Sekiguchi, Hirose

doi:10.1109/icassp.2002.1005695

Cited by 24 publications

(6 citation statements)

References 1 publication

“…Age estimation. Age estimation consists of automatically determining the age of a speaker in a given segment of the speech utterance [36]. We adopt 80-dimensional filter-bank features.…”

Section: Automatic Speech Recognition (mentioning)

confidence: 99%

Didispeech: A Large Scale Mandarin Speech Corpus

Guo, Cheng, Jiang et al. 2021

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper introduces a new open-source Mandarin speech corpus called DiDiSpeech. It consists of about 800 hours of speech data at a 48 kHz sampling rate from 6000 speakers, along with the corresponding texts. All speech data in the corpus is recorded in a quiet environment and is suitable for various speech processing tasks, such as voice conversion, multi-speaker text-to-speech, and automatic speech recognition. We conduct experiments on multiple speech tasks and evaluate the performance, showing that the corpus is promising for both academic research and practical application. The corpus is available at https://outreach.didichuxing.com/research/opendata/.

“…the lungs, the size and shape of the resonant cavities, muscle response in the vocal apparatus, and many other such factors. Due to this, it is less possible to identify age from only voice of speaker [4,5,6]. 2) Handwriting/Signature examination: Handwriting based identification restricted to certain category of writers of specific language.…”

Section: Speech or Voice Examination (mentioning)

confidence: 99%

Features and Methods of Human Age Estimation: Opportunities and Challenges in Medical Image Processing

Patil 2021

TURCOMAT

Age estimation of living species is an open and interesting problem due to its medico-legal importance, and humans are no exception. The human body undergoes various physiological changes, such as facial wrinkles and walking habits. Biological changes also help in human age estimation; some of these changes are skeletal and craniofacial growth. Various age estimation methods are available, namely manual, semi-automated, and automated methods. Each of these methods has its merits and demerits. The popular manual and semi-automated age estimation methods are prone to human observation error and need sophisticated equipment. The advent of computational methods has opened new possibilities for automating the problem, so there is growing interest in fully automated methods. In this paper, we discuss different aspects of human age estimation and present a brief review of the available methods.

“…Unfortunately, the recording condition in these corpora is limited to telephone speech. Although some other speech corpora, such as KHUST [15], TIMIT [16], and JNAS [17], have also been used for speech age estimation tasks, these corpora only contain a limited number of speakers. Most of these corpora are also unavailable for free, complicating new research for speech age estimation.…”

Section: Introduction (mentioning)

confidence: 99%

Age-VOX-Celeb: Multi-Modal Corpus for Facial and Speech Estimation

Tawara, Ogawa, Kitagishi et al. 2021

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Estimating a speaker's age from their speech is more challenging than estimating it from their face because too few public corpora are available. To tackle this problem, we construct a new audio-visual age corpus named AgeVoxCeleb by annotating VoxCeleb2 videos with age labels. AgeVoxCeleb is the first large-scale, balanced, and multi-modal age corpus that contains both video and speech of the same speakers from a wide age range. Using AgeVoxCeleb, our paper makes the following contributions: (i) By comparing the state-of-the-art models in each task, we show that a facial age estimation model can outperform a speech age estimation model. (ii) Facial age estimation is more robust against differences between training and test sets. (iii) We developed cross-modal transfer learning from face to speech age estimation, showing that the age estimated by a facial age estimation model can be used to train a speech age estimation model. The proposed AgeVoxCeleb will be published at https://github.com/nttcslab-sp/agevoxceleb.
