Probabilistic Decision-Based Neural Networks (PDBNNs) can be regarded as a special form of Gaussian Mixture Models (GMMs) with trainable decision thresholds. This paper provides detailed illustrations that compare the recognition accuracy and decision boundaries of PDBNNs with those of GMMs on two pattern recognition tasks, namely the noisy XOR problem and the classification of two-dimensional vowel data. The paper highlights the strengths of PDBNNs by demonstrating that their thresholding mechanism is very effective in detecting data that do not belong to any known class. The original PDBNNs use elliptical basis functions with diagonal covariance matrices, which may be inappropriate for modelling feature vectors with correlated components. This paper overcomes this limitation by using full covariance matrices and shows that these matrices are effective in characterising non-spherical clusters.
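The rejection idea described above can be illustrated with a minimal sketch. It is not the PDBNN training procedure: it assumes a single Gaussian per class with a full 2-D covariance matrix and hand-picked thresholds, whereas a PDBNN uses a mixture per class and learns its thresholds. The function names, `MODELS`, and `THRESHOLDS` are hypothetical and chosen only for illustration.

```python
import math

def log_gauss_2d(x, mean, cov):
    """Log-density of a 2-D Gaussian with a full (non-diagonal) covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2.0 * math.pi)

def classify_with_rejection(x, models, thresholds):
    """Return the best-scoring class, or None if its score is below threshold."""
    scores = {label: log_gauss_2d(x, mean, cov)
              for label, (mean, cov) in models.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= thresholds[best] else None

# Hypothetical toy setup: class "a" has correlated components, which a
# diagonal covariance could not represent; class "b" is spherical.
MODELS = {
    "a": ((0.0, 0.0), ((1.0, 0.8), (0.8, 1.0))),
    "b": ((4.0, 4.0), ((1.0, 0.0), (0.0, 1.0))),
}
# Hand-picked log-likelihood thresholds (assumed, not trained).
THRESHOLDS = {"a": -6.0, "b": -6.0}
```

A point near a class mean is assigned to that class, while a point far from every class scores below all thresholds and is returned as `None`, which is the behaviour a plain maximum-likelihood GMM classifier lacks.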
To improve the reliability of telephone-based speaker verification systems, channel compensation is indispensable. However, it is also important to ensure that the channel compensation algorithms in these systems suppress channel variations and enhance interspeaker distinction. This paper addresses this problem with a blind feature-based transformation approach in which the transformation parameters are determined online without any a priori knowledge of channel characteristics. Specifically, a composite statistical model formed by the fusion of a speaker model and a background model is used to represent the characteristics of enrollment speech. Based on the difference between the claimant's speech and the composite model, a stochastic matching type of approach is proposed to transform the claimant's speech to a region close to the enrollment speech. Therefore, the algorithm can estimate the transformation online without needing to detect the handset type. Experimental results based on the 2001 NIST evaluation set show that the proposed transformation approach achieves significant improvement in both equal error rate and minimum detection cost as compared to cepstral mean subtraction, Znorm, and short-time Gaussianization.
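The general flavour of a stochastic matching transformation can be sketched as follows. This is a deliberately reduced illustration, not the paper's algorithm: it assumes one-dimensional features, a bias-only transformation y = x + b, and a plain 1-D GMM standing in for the composite model. The function name `estimate_bias` and all parameters are hypothetical. The bias is re-estimated by EM so that the shifted frames become more likely under the model, with no handset label involved.

```python
import math

def estimate_bias(frames, weights, means, variances, n_iter=20):
    """EM estimate of an additive bias b so that frames + b fit a 1-D GMM."""
    b = 0.0
    for _ in range(n_iter):
        num = 0.0
        den = 0.0
        for x in frames:
            y = x + b
            # E-step: component posteriors for the shifted frame (log domain).
            logp = [math.log(w) - 0.5 * math.log(2.0 * math.pi * v)
                    - 0.5 * (y - m) ** 2 / v
                    for w, m, v in zip(weights, means, variances)]
            top = max(logp)
            post = [math.exp(lp - top) for lp in logp]
            total = sum(post)
            # M-step accumulators for the closed-form bias update.
            for g, m, v in zip(post, means, variances):
                num += (g / total) * (m - x) / v
                den += (g / total) / v
        b = num / den
    return b
```

With a single-component model the update reduces to shifting the frame mean onto the model mean, which is the same correction cepstral mean subtraction applies relative to a zero-mean target. With several components, the posteriors let each frame be pulled toward the component it most plausibly came from.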
Feature transformation aims to reduce the effects of channel and handset distortion in telephone-based speaker verification. This paper compares several feature transformation techniques and evaluates their verification performance and computation time under the 2000 NIST speaker recognition evaluation protocol. Techniques compared include feature mapping (FM), stochastic feature transformation (SFT), blind stochastic feature transformation (BSFT), feature warping (FW), and short-time Gaussianization (STG). The paper proposes a probabilistic feature mapping (PFM) in which the mapped features depend not only on the top-1 decoded Gaussian but also on the posterior probabilities of other Gaussians in the root model. The paper also proposes speeding up the computation of PFM and BSFT parameters by considering the top few Gaussians only. Results show that PFM performs slightly better than FM and that the fast approach can reduce computation time substantially. Among the approaches investigated, the fast BSFT (fBSFT) strikes a good balance between computational complexity and error rates, while FW and STG achieve the lowest error rates at a higher computational cost. It was also found that fusing the scores derived from systems using fBSFT and STG can reduce the error rate further. This study advocates that fBSFT, FW, and STG have the highest potential for robust speaker verification over telephone networks because they achieve good performance without any a priori knowledge of the communication channel.
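The difference between hard feature mapping and a posterior-weighted variant can be sketched as below. The abstract does not give the exact PFM formula, so this is an assumed form: 1-D features, a channel-dependent GMM whose component j corresponds to component j of the root model, posteriors evaluated under the channel-dependent model, and a mapped feature equal to the posterior-weighted average of the per-component mappings. The function name `map_feature` and the `top_k` parameter are hypothetical.

```python
import math

def map_feature(x, cd_weights, cd_means, cd_stds, root_means, root_stds, top_k=None):
    """Map x from a channel-dependent model toward the root model.

    top_k=1 gives hard mapping through the single best Gaussian;
    top_k=None weights every component by its posterior (assumed PFM form).
    """
    # Log-likelihood per component; the shared 0.5*log(2*pi) term cancels
    # when the posteriors are normalised, so it is omitted.
    logp = [math.log(w) - math.log(s) - 0.5 * ((x - m) / s) ** 2
            for w, m, s in zip(cd_weights, cd_means, cd_stds)]
    top = max(logp)
    post = [math.exp(lp - top) for lp in logp]
    # Keep only the most probable components (the "fast" shortcut).
    order = sorted(range(len(post)), key=lambda j: post[j], reverse=True)
    keep = order[:top_k] if top_k else order
    total = sum(post[j] for j in keep)
    mapped = 0.0
    for j in keep:
        # Per-component mapping: normalise under the channel-dependent
        # Gaussian, then rescale and shift to the root Gaussian.
        y_j = (x - cd_means[j]) * root_stds[j] / cd_stds[j] + root_means[j]
        mapped += (post[j] / total) * y_j
    return mapped
```

For a frame that clearly belongs to one Gaussian the two variants agree, while for a frame lying between two Gaussians the weighted form interpolates between their mappings instead of committing to one. Restricting the sum to the top few components keeps most of that benefit at lower cost.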