2010 5th International Symposium on Telecommunications 2010
DOI: 10.1109/istel.2010.5734096

Determination of pitch range based on onset and offset analysis in modulation frequency domain

Abstract: An auditory scene in a natural environment contains multiple sources. Auditory scene analysis (ASA) is the process by which the auditory system segregates a scene into streams corresponding to different sources. Determining the range of pitch frequency is necessary for segmentation. We propose a system that determines the range of pitch frequency by analyzing onsets and offsets in the modulation frequency domain. In the proposed system, the modulation spectrum of speech is first calculated and then, in eac…

Cited by 3 publications (1 citation statement)
References 16 publications

“…Likewise, in speech technologies such as automatic speech recognition/understanding and automatic assessment of pronunciation, F0 normalization is very important and hence an automatic estimation of the speaker-specific F0 range is necessary, especially for tone languages where F0 plays more important roles than in non-tone languages [9,10]. However, in most previous studies a calculation of F0 range is accomplished by a direct statistical analysis of F0 variation from a lengthy speech input [10][11][12][13], which apparently does not coincide with the way human listeners estimate F0 range in the condition of a very brief speech input. To mimic the auditory mechanism of human listeners, a recent study of ours [14] proposed a model of estimating F0 range from spectral features.…”
Section: Introduction
confidence: 99%