Contrastive auto-encoder for phoneme recognition

Zheng, Xin; Wu, Zhiyong; Meng, Helen; Cai, Lianhong

doi:10.1109/icassp.2014.6854056

Cited by 9 publications

(9 citation statements)

References 9 publications


“…We also introduce an improved cAE architecture and training method that reduces the number of hyperparameters to be tuned, and show that narrow architectures work better, with reduced error rates on a zero-resource language after tuning on English. Unlike similar previous work [9,10,11] our cAE-based systems are fully unsupervised, train on individual frames without context, and use a loss function in the input vector space instead of the representation vector space.…”

Section: Introduction (mentioning)

confidence: 99%

A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge

et al. 2015


The success of supervised deep neural networks (DNNs) in speech recognition cannot be transferred to zero-resource languages where the requisite transcriptions are unavailable. We investigate unsupervised neural network based methods for learning frame-level representations. Good frame representations eliminate differences in accent, gender, channel characteristics, and other factors to model subword units for within- and across-speaker phonetic discrimination. We enhance the correspondence autoencoder (cAE) and show that it can transform Mel Frequency Cepstral Coefficients (MFCCs) into more effective frame representations given a set of matched word pairs from an unsupervised term discovery (UTD) system. The cAE combines the feature extraction power of autoencoders with the weak supervision signal from UTD pairs to better approximate the extrinsic task's objective during training. We use the Zero Resource Speech Challenge's minimal triphone pair ABX discrimination task to evaluate our methods. Optimizing a cAE architecture on English and applying it to a zero-resource language, Xitsonga, we obtain a relative error rate reduction of 35% compared to the original MFCCs. We also show that Xitsonga frame representations extracted from the bottleneck layer of a supervised DNN trained on English can be further enhanced by the cAE, yielding a relative error rate reduction of 39%.
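As a rough illustration of the correspondence idea in the abstract above, the sketch below scores one frame pair: the network reads a frame from one word of a matched pair and is penalized for how far its output lands from the paired frame, with the loss measured in the input vector space. It assumes the word pairs have already been aligned frame by frame. The single tanh hidden layer, the plain weight matrices, and the names `matvec` and `correspondence_loss` are illustrative assumptions, not the authors' architecture.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def matvec(weights: Matrix, x: Sequence[float]) -> List[float]:
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def correspondence_loss(
    enc: Matrix,
    dec: Matrix,
    frame_a: Sequence[float],
    frame_b: Sequence[float],
) -> float:
    """Squared error between the network output for frame_a and its paired frame_b.

    Hypothetical one-hidden-layer network: tanh encoder, linear decoder.
    """
    hidden = [math.tanh(h) for h in matvec(enc, frame_a)]
    output = matvec(dec, hidden)
    return sum((o - t) ** 2 for o, t in zip(output, frame_b))
```

A plain autoencoder would pass `frame_a` as both input and target; the only change in this sketch is that the target comes from the other word in the pair.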


“…No Rifai et al (2011a) Higher order Contractive autoencoder: CAE + Hessian of the output wrt the input. No Zheng et al (2014) Contrastive autoencoder: A term to reduce the intra-class variations between the learned representation of samples belonging to the same class is added at the final layer.…”

Section: No (mentioning)

confidence: 99%

“…Contrastive Autoencoder (CsAE) proposed by Zheng et al (2014), is another variant of supervised autoencoder which uses the class label information during training. The loss function of the model is the difference between the output of two sub-autoencoders trained simultaneously on samples belonging to the same class, along with the loss function of each subautoencoder.…”

Section: No (mentioning)

confidence: 99%
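The loss described in the two statements above can be sketched as follows: each sub-autoencoder contributes its own reconstruction error, and an extra term penalizes the difference between the two outputs when both inputs share a class. The squared Euclidean distance, the `weight` parameter, and the function names are assumptions made for illustration; the statements do not specify the exact distance or weighting.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def contrastive_ae_loss(
    x1: Sequence[float],
    out1: Sequence[float],
    x2: Sequence[float],
    out2: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Loss for one same-class pair (x1, x2) with sub-autoencoder outputs out1, out2.

    weight is a hypothetical trade-off between reconstruction and the
    same-class difference term.
    """
    reconstruction = squared_distance(x1, out1) + squared_distance(x2, out2)
    same_class_difference = squared_distance(out1, out2)
    return reconstruction + weight * same_class_difference
```

Setting `weight` to zero reduces this to two independent autoencoders, which makes the effect of the class-label term easy to isolate.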

Are you eligible? Predicting adulthood from face images via Class Specific Mean Autoencoder

Singh, Nagpal, Vatsa, et al. 2019

Pattern Recognition Letters


Predicting if a person is an adult or a minor has several applications such as inspecting underage driving, preventing purchase of alcohol and tobacco by minors, and granting restricted access. The challenging nature of this problem arises due to the complex and unique physiological changes that are observed with age progression. This paper presents a novel deep learning based formulation, termed as Class Specific Mean Autoencoder, to learn the intra-class similarity and extract class-specific features. We propose that the feature of a particular class if brought similar/closer to the mean feature of that class can help in learning class-specific representations. The proposed formulation is applied for the task of adulthood classification which predicts whether the given face image is of an adult or not. Experiments are performed on two large databases and the results show that the proposed algorithm yields higher classification accuracy compared to existing algorithms and a Commercial-Off-The-Shelf system.
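The abstract's central idea, bringing each feature closer to the mean feature of its class, can be illustrated with a small penalty term. This is a minimal sketch under assumptions: the squared Euclidean distance, the summation over all samples, and the name `class_mean_penalty` are illustrative choices, and the paper's full formulation also involves the autoencoder's reconstruction loss, which is omitted here.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def class_mean_penalty(
    features: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
) -> float:
    """Sum of squared distances from each feature vector to its class mean."""
    # Group feature vectors by class label.
    by_class: Dict[Hashable, List[Sequence[float]]] = defaultdict(list)
    for feature, label in zip(features, labels):
        by_class[label].append(feature)

    penalty = 0.0
    for members in by_class.values():
        dim = len(members[0])
        mean = [sum(m[i] for m in members) / len(members) for i in range(dim)]
        for m in members:
            penalty += sum((m[i] - mean[i]) ** 2 for i in range(dim))
    return penalty
```

A penalty of zero means every sample already sits exactly on its class mean; larger values indicate more intra-class spread in the learned features.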


“…Recently, supervised extensions of traditional unsupervised model of the autoencoder have also been proposed [17], [18], [19], [20]. Most of these algorithms incorporate class information at the time of feature extraction with an aim to reduce only the intra-class variations.…”

Section: B. Proposed Class Representative Autoencoder (mentioning)

confidence: 99%

Class representative autoencoder for low resolution multi-spectral gender classification

Singh, Nagpal, Singh, et al. 2017

2017 International Joint Conference on Neural Networks (IJCNN)


Gender is one of the most common attributes used to describe an individual. It is used in multiple domains such as human computer interaction, marketing, security, and demographic reports. Research has been performed to automate the task of gender recognition in constrained environment using face images, however, limited attention has been given to gender classification in unconstrained scenarios. This work attempts to address the challenging problem of gender classification in multi-spectral low resolution face images. We propose a robust Class Representative Autoencoder model, termed as AutoGen for the same. The proposed model aims to minimize the intraclass variations while maximizing the inter-class variations for the learned feature representations. Results on visible as well as near infrared spectrum data for different resolutions and multiple databases depict the efficacy of the proposed model. Comparative results with existing approaches and two commercial off-the-shelf systems further motivate the use of class representative features for classification.
