Automatic speaker recognition of Spanish siblings: (monozygotic and dizygotic) twins and non-twin brothers

San Segundo, Eugenia; Künzel, Hermann J.

doi:10.3989/loquens.2015.021

Cited by 15 publications

(16 citation statements)

References 48 publications

“…The results from these experiments show that identical twins consistently pose a greater problem to the distinguishing capabilities of voice matching algorithms than non-related individuals as corroborated by Segundo et al [42]. Due to the nature of the present data, it was not possible to replicate all previous works' experiments.…”

Section: Do Twins Pose a Problem To Speaker Verification? (supporting)

confidence: 49%

A Long Short-term Memory Neural Network for Improved Twins' Voice Differentiation

Sabatier

Although successfully implemented in certain situations, the reliability of speaker recognition (SR) decreases due to speaker and channel variability between enrollment and evaluation samples, as well as the available length of speech utterances. The issue of speaker variability becomes more pronounced as the number of speakers in the evaluation set increases, because there is a higher probability that two voices sound alike. Differentiating the voices within twin pairs can benefit general SR because including twins simulates this effect: their shared voice similarities are comparable to the inter-speaker similarities one would expect in a large database of enrolled speakers. Furthermore, twin occurrence has steadily increased over the past thirty years. Despite these considerations, few research efforts have analyzed the impact of identical twins on SR, and those that exist have been limited in the corpus of individuals sampled and/or the technology employed. In this research effort, a recurrent neural network that specializes in processing time series data, specifically the long short-term memory (LSTM) network, is evaluated on a large corpus of identical twins' speech collected over two years with multiple speaking modes. The LSTM's recurrent capability enables the exploitation of higher-level speech features, which are hypothesized to vary more between identical twins. The LSTM is configured both as a single network and in a Siamese fashion to evaluate performance across varied utterance lengths and speech features. Matching results are analyzed and discussed in comparison to state-of-the-art i-vector methodologies.
Results in terms of the equal error rate (EER) indicate that, as the length of the enrollment and test utterances is reduced, the LSTM outperforms the i-vector system by 17% for two seconds of speech data. Of the three speech features investigated in the Siamese configuration, mel-frequency spectral coefficients resulted in the highest rate of twin voice differentiation, with an EER of 8.57% for six seconds of data. Lastly, in comparison to other twins' voice studies, the introduction of more individuals degrades performance for male speakers from nearly 0% to 0.598% EER. However, the results from these experiments are far lower than those of female trials in other studies.
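
The abstract above reports results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER can be estimated from two lists of similarity scores; the function name, the toy scores, and the convention that a higher score means "more likely the same speaker" are assumptions for illustration and are not taken from the cited work.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from similarity scores (higher = more similar).

    genuine:  scores of same-speaker trials
    impostor: scores of different-speaker trials (e.g. a twin sibling)
    Returns (eer, threshold) at the threshold where FAR and FRR are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: a genuine trial scores below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical toy scores, not data from any of the cited studies.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")  # EER = 25.00% at threshold 0.6
```

With few trials the FAR and FRR curves rarely cross exactly, so this sketch returns their average at the closest point; evaluation toolkits typically interpolate instead.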

“…The only study to investigate not only MZ twins but also DZ twins and their siblings was conducted by San Segundo and Kunzel [42]. The same set of matching experiments as in [40] [43], adjustment of speaker models [44] or score normalization [45].…”

Section: Automatic Speaker Recognition Studies On Twins (mentioning)

confidence: 99%

A Long Short-term Memory Neural Network for Improved Twins' Voice Differentiation

Sabatier

“…The purpose of automatic speaker recognition is to identify a person from pronounced speech, this work can be further turned to the process of the identification or verification where the first matching is 1: 1 (claiming an identity and matcher task is to accept or reject the individual), or the latter, which is a 1: N match (the identification of speaker depends on making comparison to N registered speakers) [12]. Fig.…”

Section: Background Theory a Speaker Recognition (mentioning)

confidence: 99%
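
The distinction drawn in the statement above, between a 1:1 match that accepts or rejects a claimed identity (verification) and a 1:N match against N registered speakers (identification), can be sketched as follows. This is an illustrative sketch only: the fixed-length speaker vectors, the cosine scoring, the 0.8 threshold, and the speaker names are all assumptions, not details from the cited papers.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(test_vec, claimed_model, threshold=0.8):
    """1:1 match: accept (True) or reject (False) a claimed identity."""
    return cosine(test_vec, claimed_model) >= threshold


def identify(test_vec, enrolled):
    """1:N match: return the name of the best-scoring enrolled speaker."""
    return max(enrolled, key=lambda name: cosine(test_vec, enrolled[name]))


# Hypothetical enrolled speaker models (toy 2-D vectors).
enrolled = {"speaker_a": [1.0, 0.0], "speaker_b": [0.0, 1.0]}
sample = [0.9, 0.1]

print(verify(sample, enrolled["speaker_a"]))  # claim "I am speaker_a"
print(verify(sample, enrolled["speaker_b"]))  # claim "I am speaker_b"
print(identify(sample, enrolled))             # closed-set identification
```

Note that `identify` as written is closed-set: it always returns some enrolled speaker, even when the sample belongs to none of them.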

“…Seemingly, text-independent is seen much more difficult task to achieve in speaker recognition system. Generally, choosing the features of any Speaker Recognition system is regarded a key matter in order to obtain an accurate performance, where this system must have high variability between speakers and low variability within each person,in addition to other standards [12].…”

Section: Background Theory a Speaker Recognition (mentioning)

confidence: 99%

“…The traits are described of being environmentally-influenced that anyone can acquire them by education, place of living, and the social or family environment, socio-economic status. Remarkably, these features could have many advantages and the most striking one is robustness to channel effects and noise and. However, they entail advanced extraction techniques [14].…” [Figure caption embedded in the excerpt: "Figure 1: Speaker verification diagram adapted from [12]"]

Section: Background Theory a Speaker Recognition (mentioning)

confidence: 99%

Arabic Speech Recognition Based on Knn, J48, and LVQ

Alwahed

Jawad

2019

Iraqi Journal of ICT

Most speaker recognition systems work on speech features classified as low-level, which rely considerably on the speaker's physical characteristics and, to a lesser extent, on acquired speaking habits. This paper presents a system for recognizing and identifying Arabic speakers. It includes two phases, training and testing, and each phase uses the audio features mean, standard deviation, zero crossing, and amplitude. After the features are extracted, the recognition step uses J48, KNN, and LVQ: the nearest-neighbor classifier (KNN) is applied to measure the similarity between the training and testing data, and an LVQ neural network is used for speech recognition and Arabic language identification. The corpus consists of ten sentences, containing words such as "kidnappings" and "kidnappers", each pronounced by 10 people (five men and five women of different ages), for a total of 100 samples recorded as audio in wave format. The results for the sentences pronounced by women are higher than those for the same sentences pronounced by men. The recognition rates achieved were 85%, 93%, and 96.4%.
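
The pipeline described in this abstract, four simple audio features followed by a nearest-neighbor classifier, can be sketched roughly as below. The abstract does not define the features or the classifier settings, so everything here is an assumption: "amplitude" is read as peak absolute amplitude, "zero crossing" as a count of sign changes, distances are unscaled Euclidean, and `k=3` and the labels are arbitrary.

```python
import math
from collections import Counter


def features(signal):
    """Return [mean, standard deviation, zero-crossing count, peak amplitude]."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    # A zero crossing is counted whenever consecutive samples differ in sign.
    zero_crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    peak = max(abs(x) for x in signal)
    return [mean, std, zero_crossings, peak]


def knn_predict(train, query, k=3):
    """Majority vote among the k training items nearest to the query.

    train: list of (feature_vector, label) pairs
    query: feature vector of the sample to classify
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical toy signals standing in for recorded utterances.
train = [
    (features([0.1, -0.1, 0.1, -0.1]), "speaker_1"),
    (features([0.2, -0.2, 0.2, -0.2]), "speaker_1"),
    (features([0.9, 0.8, 0.9, 0.8]), "speaker_2"),
    (features([0.8, 0.9, 0.8, 0.9]), "speaker_2"),
]
print(knn_predict(train, features([0.15, -0.15, 0.15, -0.15]), k=3))
```

Because the zero-crossing count is on a much larger scale than the other three features, a real system would normally standardize the features before computing distances.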

Twin identification from speech: linear and non-linear cepstral features and models

Revathi

Nagakrishnan

Sasikaladevi

2020

Int J Speech Technol
