2022 8th International Conference on Advanced Computing and Communication Systems (ICACCS) 2022
DOI: 10.1109/icaccs54159.2022.9785196

Dilated Convolution and MelSpectrum for Speaker Identification using Simple Deep Network

citations
Cited by 3 publications
(4 citation statements)
references
References 14 publications
“…These features harbor discriminative properties tailored to individual utterances, encapsulating their intrinsic characteristics. Regarded as a data reduction step, feature extraction condenses lengthy utterances into compact data that encapsulates the core attributes of the speaker [1].…”
Section: Speaker Identification Process
mentioning
confidence: 99%
“…Speech has been a fundamental mode of human communication since ancient times, arising from vocal tract excitation. Physiological attributes contributing to speech differ across individuals, including variations in vocal tract size, shape, vocal fold structure, velum, and nasal cavity, especially between genders [1][2][3].…”
Section: Introduction
mentioning
confidence: 99%
“…Across the several methods in Table 2, the proposed method improves the accuracy by 7.61% compared to [22] baseline method for 10 speakers. The accuracy of the proposed method further improves by nearly 8% and 14% compared to the baseline methods [27] and [6] respectively. The proposed work does not introduce computational complexity since extracting the logmelspectrum and excitation features have been done offline.…”
mentioning
confidence: 94%
“…Speaker recognition can be divided into text-dependent and text-independent systems. Text-dependent modules work with the same set of phrases in training and testing [6]. Whereas in text-independent systems, there are no such limitations on text phrases in training and testing.…”
Section: Introduction
mentioning
confidence: 99%