Recently, speaker embeddings extracted from a speaker-discriminative deep neural network (DNN) have yielded better performance than conventional methods such as i-vectors. In most cases, the DNN speaker classifier is trained with cross-entropy loss over a softmax output. However, this loss does not explicitly encourage inter-class separability or intra-class compactness, so the resulting embeddings are not optimal for speaker recognition. To address this issue, this paper introduces three margin-based losses to deep speaker embedding learning; these losses not only separate classes but also enforce a fixed margin between them. We demonstrate that the margin is the key to obtaining more discriminative speaker embeddings. Experiments are conducted on two public text-independent tasks: VoxCeleb1 and Speakers in the Wild (SITW). The proposed approach achieves state-of-the-art performance, with a 25% ∼ 30% equal error rate (EER) reduction on both tasks compared to strong baselines trained with softmax cross-entropy, reaching 2.238% EER on the VoxCeleb1 test set and 2.761% EER on the SITW core-core test set.

Index Terms: speaker recognition, speaker embeddings, angular softmax, additive margin softmax, additive angular margin loss
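To make concrete how a margin modifies plain softmax cross-entropy, the following is a minimal numpy sketch of one of the three losses, additive margin softmax (AM-softmax). The function name and the scale s and margin m defaults are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def am_softmax_loss(embeddings, weights, labels, s=30.0, m=0.35):
    """Additive margin softmax: cross entropy over scaled cosine logits,
    with the margin m subtracted from the target-class cosine only."""
    # L2-normalize embeddings (rows) and class weight vectors (columns)
    # so the logits become cosine similarities in [-1, 1].
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = e @ w                                   # shape: (batch, num_classes)
    rows = np.arange(len(labels))
    cos[rows, labels] -= m                        # penalize the true class by m
    logits = s * cos                              # rescale the cosine logits
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[rows, labels].mean()
```

Setting m = 0 recovers ordinary softmax cross-entropy up to the scale s; the angular softmax and additive angular margin variants instead apply the margin to the angle θ (cos(mθ) and cos(θ + m), respectively) rather than to cos θ.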
Deep neural networks (DNNs) are widely used in most current automatic speech recognition (ASR) systems. To guarantee good recognition performance, DNNs usually require significant computational resources, which limits their application on low-power devices. It is therefore appealing to reduce the computational cost while preserving accuracy. In this work, motivated by their success in image recognition, binary DNNs are applied to speech recognition, where they achieve competitive performance with a substantial speed-up. To our knowledge, this is the first time binary DNNs have been used in speech recognition. In a binary DNN, network weights and activations are constrained to binary values, which enables faster matrix multiplication based on bit operations. By exploiting hardware population-count (popcount) instructions, the proposed binary matrix multiplication achieves a 5 ∼ 7 times speed-up over highly optimized floating-point matrix multiplication. This yields much faster DNN inference, since matrix multiplication is the most computationally expensive operation. Experiments on both TIMIT phone recognition and a 50-hour Switchboard speech recognition task show that binary DNNs run about 4 times faster than standard DNNs during inference, with roughly 10.0% relative accuracy reduction.
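The bit-operation trick rests on a simple identity: for vectors with entries in {-1, +1}, if d positions differ then the dot product is (n - d) - d = n - 2d, so a dot product reduces to an XOR (or XNOR) followed by a popcount. Below is a toy Python sketch of this idea, assuming {-1, +1}-valued matrices; all names are illustrative, and a production kernel would pack bits into machine words and use hardware popcount instructions, as the abstract describes.

```python
import numpy as np

def pack_bits(v):
    """Pack a {-1, +1} vector into an integer bitmask (bit i set iff v[i] == +1)."""
    word = 0
    for i, x in enumerate(v):
        if x > 0:
            word |= 1 << i
    return word

def binary_matmul(A, B):
    """Multiply A (m x n) and B (n x p), entries in {-1, +1}, using
    XOR + popcount: dot(a, b) = n - 2 * popcount(pack(a) XOR pack(b))."""
    m, n = A.shape
    assert B.shape[0] == n
    rows = [pack_bits(A[i, :]) for i in range(m)]
    cols = [pack_bits(B[:, j]) for j in range(B.shape[1])]
    C = np.empty((m, len(cols)), dtype=np.int64)
    for i, a in enumerate(rows):
        for j, b in enumerate(cols):
            # Each set bit of a ^ b marks a position where the signs differ.
            n_diff = bin(a ^ b).count("1")
            C[i, j] = n - 2 * n_diff
    return C

# Sanity check against ordinary integer matrix multiplication.
rng = np.random.default_rng(0)
A = np.where(rng.standard_normal((4, 64)) >= 0, 1, -1)
B = np.where(rng.standard_normal((64, 3)) >= 0, 1, -1)
assert np.array_equal(binary_matmul(A, B), A @ B)
```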