2013
DOI: 10.1121/1.4773255

Dimensional feature weighting utilizing multiple kernel learning for single-channel talker location discrimination using the acoustic transfer function

Abstract: This paper presents a method for discriminating the location of the sound source (talker) using only a single microphone. In a previous work, the single-channel approach for discriminating the location of the sound source was discussed, where the acoustic transfer function from a user's position is estimated by using a hidden Markov model of clean speech in the cepstral domain. In this paper, each cepstral dimension of the acoustic transfer function is newly weighted, in order to obtain the cepstral dimensions…
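The per-dimension weighting that the abstract describes can be pictured as a weighted sum of base kernels, one per cepstral dimension, which is the usual form of a multiple kernel learning (MKL) combination. The following is a minimal sketch under stated assumptions, not the authors' implementation: the RBF base kernel, the `gamma` parameter, and the function names `rbf_kernel_1d` and `combined_kernel` are hypothetical choices made for illustration, and the weights are supplied by hand here, whereas MKL would learn them from training data.

```python
import numpy as np


def rbf_kernel_1d(x, y, gamma=1.0):
    """RBF kernel matrix computed on a single feature dimension.

    x and y are 1-D arrays holding one cepstral dimension for each sample.
    The RBF form and gamma are illustrative assumptions.
    """
    diff = x[:, None] - y[None, :]
    return np.exp(-gamma * diff ** 2)


def combined_kernel(X, Y, weights, gamma=1.0):
    """Weighted sum of per-dimension base kernels.

    X has shape (n, D) and Y has shape (m, D), where D is the number of
    cepstral dimensions. weights has length D; it is normalized to sum
    to 1 so the result stays a convex combination of valid kernels.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (X.shape[1],):
        raise ValueError("need exactly one weight per feature dimension")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()

    K = np.zeros((X.shape[0], Y.shape[0]))
    for d, w in enumerate(weights):
        K += w * rbf_kernel_1d(X[:, d], Y[:, d], gamma)
    return K
```

A kernel matrix built this way could be passed to any classifier that accepts precomputed kernels. Setting a weight to zero removes that dimension from the similarity measure entirely, which is how a weighting of this kind can suppress uninformative dimensions.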

Citations: cited by 4 publications (3 citation statements)
References: 13 publications
“…τ is the index number of frequency bins of the short-term linear spectra in the n-th frame sound signal sequence, and L ′ express the length of the ATF in the STFT domain. However, the cost of such solution to estimate the frame sequence of the ATF is quite expensive [41]. Therefore, the estimated components of the ATF are too complex and it is difficult to deal with those parameters for this task.…”
Section: Estimation Of Source Distance With the Proposed Acoustic Model (citation type: mentioning)
confidence: 99%
“…From the artificial pinna, the sound source elevation is estimated by using the propagation transfer function and the neural network classifier [23]. The characteristics of indoor speech propagation were utilized for non-structural ML within a limited situation [24, 25, 26]. The parabolic structure with cepstral speech parameters was explored for position-dependent indoor ML [27].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The directivity pattern of the head-related transfer function was improved by exploring the various structures around the microphone [24]. Non-structural ML actively utilized the characteristics of indoor speech propagation for situation-related localization [25, 26, 27]. The indoor condition exhibited position-dependent Cepstral and speech parameters, which could be further enhanced by the parabolic reflection structure for the ML system [28].…”
Section: Introduction (citation type: mentioning)
confidence: 99%