Speaker diarization is the process of segmenting an audio signal by speaker and grouping the segments into homogeneous regions, one per speaker identity. The central idea is to distinguish the speakers' signals from one another and to attach a label to each. As mass communication and meetings grow rapidly, speaker diarization is needed to improve the readability of speech transcripts. To address this problem, the proposed approach extracts features using the tangent weighted mel-frequency cepstral coefficient (TMFCC) and the holoentropy with extended linear prediction with autocorrelation snapshot (HXLPS), and uses a deep convolutional neural network (DCNN), optimized with the sailfish optimizer, for clustering. HXLPS is a new development that adds holoentropy to the extended linear prediction with autocorrelation snapshot feature. TMFCC improves the efficiency and effectiveness of the proposed scheme by making use of both lower-energy and higher-energy frames. With these features, the voice activity detection method can distinguish speech from non-speech signals. Each segmented signal is then represented by a d-vector, and the segments are clustered by the DCNN according to speaker label. Effectiveness is examined with evaluation metrics such as tracking distance, false alarm rate, and diarization error rate.
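The voice activity detection step mentioned above can be illustrated with a minimal sketch. This is not the method of the paper: it is a plain short-time energy detector, and the function names, the frame length of 400 samples, the hop of 160 samples, and the -35 dB threshold are all assumptions chosen for illustration.

```python
import math
from typing import List


def frame_energies(signal: List[float], frame_len: int, hop: int) -> List[float]:
    """Return the mean squared amplitude (short-time energy) of each frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def energy_vad(signal: List[float], frame_len: int = 400, hop: int = 160,
               threshold_db: float = -35.0) -> List[bool]:
    """Mark a frame as speech when its energy, relative to the loudest
    frame, is above threshold_db. All defaults are illustrative assumptions."""
    energies = frame_energies(signal, frame_len, hop)
    if not energies:
        return []
    peak = max(energies)
    if peak == 0.0:
        # Entirely silent input: no frame can be speech.
        return [False] * len(energies)
    floor = 1e-12  # avoids log10(0) on silent frames
    return [10.0 * math.log10(max(e, floor) / peak) > threshold_db
            for e in energies]


# Synthetic input: one second of silence followed by one second of a 220 Hz tone.
sample_rate = 16000
silence = [0.0] * sample_rate
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate)
        for n in range(sample_rate)]
decisions = energy_vad(silence + tone)
```

On this synthetic signal the frames that lie entirely in the silent half are marked as non-speech and the frames that lie entirely in the tone are marked as speech. A real system would typically smooth these frame decisions before segmenting.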
Speech is the most important means of communication among humans. Speech signal processing covers many tasks, including speech coding, speaker recognition, and speaker verification. Speaker diarization is the pre-processing stage for many applications of speaker recognition systems. Speaker diarization is the task of determining “who spoke when” in an audio recording that contains an unknown amount of speech and an unknown number of speakers. It has become a key technology for many tasks such as navigation, retrieval, or higher-level inference on audio data. It mainly performs three operations: feature extraction, voice activity detection, and classification. In this paper, we review a few speaker diarization techniques. State-of-the-art speaker diarization systems have achieved good results. The performance of a few speaker diarization systems is evaluated in terms of diarization error, tracking time, and false alarm.
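The diarization error and false alarm measures named above can be made concrete with a small frame-level sketch. It is a simplification under stated assumptions, not the scoring procedure used in either paper: each frame carries at most one speaker label (`None` means non-speech), the hypothesis labels are assumed to be already mapped to the reference labels, no forgiveness collar is applied, and each error type is normalized by the number of reference speech frames. The function name `diarization_errors` is hypothetical.

```python
from typing import Dict, List, Optional


def diarization_errors(reference: List[Optional[str]],
                       hypothesis: List[Optional[str]]) -> Dict[str, float]:
    """Frame-level missed speech, false alarm, speaker confusion, and DER.

    Assumes one label per frame (None = non-speech) and that hypothesis
    labels have already been mapped onto reference labels.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have equal length")

    missed = false_alarm = confusion = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None and hyp is None:
            missed += 1          # speech in reference, none detected
        elif ref is None and hyp is not None:
            false_alarm += 1     # speech detected where reference has none
        elif ref is not None and ref != hyp:
            confusion += 1       # speech detected but wrong speaker

    total_speech = sum(ref is not None for ref in reference)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")

    return {
        "missed": missed / total_speech,
        "false_alarm": false_alarm / total_speech,
        "confusion": confusion / total_speech,
        "der": (missed + false_alarm + confusion) / total_speech,
    }


# Ten frames: the hypothesis starts one frame early throughout.
reference = [None, "A", "A", "A", "B", "B", "B", "B", None, None]
hypothesis = ["A", "A", "A", "B", "B", "B", "B", None, None, None]
result = diarization_errors(reference, hypothesis)
```

In this example there are seven reference speech frames and one error of each type, so the diarization error rate is 3/7. Standard scoring tools additionally search for the best mapping between hypothesis and reference speakers, which this sketch leaves out.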