2016
DOI: 10.4081/audiores.2016.137

Automated Classification of Vowel Category and Speaker Type in the High-Frequency Spectrum

Abstract: The high-frequency region of vowel signals (above the third formant or F3) has received little research attention. Recent evidence, however, has documented the perceptual utility of high-frequency information in the speech signal above the traditional frequency bandwidth known to contain important cues for speech and speaker recognition. The purpose of this study was to determine if high-pass filtered vowels could be separated by vowel category and speaker type in a supervised learning framework. Mel frequency…

Cited by 5 publications (4 citation statements)
References 14 publications
“…This may have been due to the fact that since a simple frequency representation was used for classification, the high-frequency region of male vowels contained richer harmonic information (and additional information from which a classification decision could be made) due to lower fundamental frequencies in male signals. The overall findings of the current study are in line with those reported by Donai et al (2016), who reported accurate classification of six vowels produced by a limited number of speakers (two male, two female, and two children) using information above approximately 3.5 kHz.…”
Section: Discussion (supporting)
confidence: 92%
“…In recent years, researchers have been increasingly interested in whether high-frequency cues are robust enough for classification methods. Several approaches have been used, including linear discriminant analyses (Donai and Paschall, 2015) and various machine recognition (e.g., Deshpande and Holambe, 2011; Donai et al, 2016; Itakura 1994, 1995) and segregation techniques (Hu and Wang, 2004). The results of these studies are encouraging, demonstrating that speech information above 3-4 kHz is useful for machine recognition tasks.…”
Section: Introduction (mentioning)
confidence: 99%
“…Computer vision deals with acquiring, processing, and understanding images in order to solve different tasks. Computer vision has a wide range of applications including video gaming [16], in the food industry [17], robotics [18,19,20], biomedical [21,22], and many more [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41].…”
Section: Introduction 11 Problem Definition (mentioning)
confidence: 99%