2013 IEEE Workshop on Automatic Speech Recognition and Understanding
DOI: 10.1109/asru.2013.6707745

Deep maxout neural networks for speech recognition

Abstract: A recently introduced type of neural network called maxout has worked well in many domains. In this paper, we propose to apply maxout for acoustic models in speech recognition. The maxout neuron picks the maximum value within a group of linear pieces as its activation. This nonlinearity is a generalization of the rectified nonlinearity and has the ability to approximate any form of activation function. We apply maxout networks to the Switchboard phone-call transcription task and evaluate the performances unde…
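The maxout activation described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the argument layout (one weight vector and one bias per linear piece), and the example values are assumptions chosen for illustration. It shows a single unit taking the maximum over its linear pieces, and how fixing one piece at zero recovers the rectifier.

```python
def maxout(x, weights, biases):
    """Maxout activation of a single unit: the largest of k linear pieces.

    x       -- input vector (list of floats)
    weights -- k weight vectors, one per linear piece (hypothetical layout)
    biases  -- k biases, one per linear piece
    """
    if len(weights) != len(biases):
        raise ValueError("need one bias per linear piece")
    pieces = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(pieces)


def relu(z):
    """Rectified linear activation, for comparison."""
    return max(0.0, z)


# With two pieces, one of them fixed at zero, maxout reduces to ReLU.
w, b = [0.5, -1.0], 0.25
zero_piece = [0.0, 0.0]

for x in ([2.0, 3.0], [4.0, 1.0]):
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    assert maxout(x, [w, zero_piece], [b, 0.0]) == relu(z)

# Two pieces with opposite slopes give the absolute value instead.
assert maxout([-3.0], [[1.0], [-1.0]], [0.0, 0.0]) == 3.0
```

In a trained network all pieces are learned, so the shape of the activation is not fixed in advance; the zero piece above is only there to make the ReLU special case visible.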

Cited by 70 publications (30 citation statements)
References 12 publications
“…MNNs, instead of making a prior assumption about parametric form of non-linearity, try to build it automatically from a number of linear components. While this work was under review two additional papers were published on maxout activation for ASR [25,26]. As a result, contributions of this work overlap to some extent with one or the other and we will refer to those in text when necessary.…”
Section: Introduction (mentioning)
confidence: 99%
“…This activation function can be regarded as a generalization of the rectifier function [16], and so far, only a few studies have attempted to apply maxout networks to speech recognition tasks. These all found that maxout nets slightly outperformed ReLU networks, in particular under low-resource conditions [17][18][19]. Here, we show that the pooling procedure applied in CNNs and the pooling step of the maxout function are practically the same, and hence, it is trivial to combine the two techniques and construct convolutional networks out of maxout neurons.…”
Section: Introduction (mentioning)
confidence: 92%
“…In our experiments with p-norm pooling, we set p to 2, following Zhang et al., but in our first tests, the group size was set to 2, which was found to be optimal for maxout networks [17][18][19]. Our tests quickly revealed that our p-norm implementation faces difficulties with propagating the error back to lower layers.…”
Section: Improvements To Maxout (mentioning)
confidence: 99%
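The difference between maxout pooling and p-norm pooling in the statement above can be sketched as follows, using p = 2 and a group size of 2 as quoted. This is an illustrative sketch only: the helper names are hypothetical, and the p-norm is written in its usual form, (sum of |v|^p)^(1/p), which is assumed to match the citing authors' definition.

```python
import math


def group_pool(values, group_size, pool):
    """Apply `pool` to consecutive, non-overlapping groups of `values`."""
    if len(values) % group_size != 0:
        raise ValueError("number of values must be a multiple of group_size")
    return [
        pool(values[i:i + group_size])
        for i in range(0, len(values), group_size)
    ]


def p_norm(group, p=2.0):
    """p-norm of one group: (sum |v|^p)^(1/p)."""
    return sum(abs(v) ** p for v in group) ** (1.0 / p)


# Four linear-piece outputs pooled in groups of two (hypothetical values).
linear_outputs = [3.0, -4.0, 1.0, 2.0]

maxout_units = group_pool(linear_outputs, 2, max)     # [3.0, 2.0]
pnorm_units = group_pool(linear_outputs, 2, p_norm)   # [5.0, sqrt(5)]

assert maxout_units == [3.0, 2.0]
assert math.isclose(pnorm_units[0], 5.0)
assert math.isclose(pnorm_units[1], math.sqrt(5.0))
```

Maxout passes gradient only to the winning piece in each group, whereas the p-norm depends on every piece in the group, so the two pooling functions distribute the back-propagated error differently.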
“…For the English model, a DNN with 5 hidden layers and a softmax output layer with 41 units was trained on 700 hours of English data from the Switchboard and Fisher datasets. More details about DNN model training can be found in [11].…”
Section: System Description (mentioning)
confidence: 99%