“…We used seven different training configurations to test the effect of varying the number of parameters. Specifically, we trained models with 2, 4, 8, 12, 16, 20, and 24 hidden units or Gaussians. Each autoencoder was initialized with random weights between -0.05 and 0.05 and trained on the whitened features for 100 iterations using a batch-mode backpropagation algorithm with an adaptive learning rate initialized at 0.05, a momentum term of 0.045, and batch shuffling.…”
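
As a minimal sketch of the training procedure described above: the hyperparameters taken from the passage are the uniform weight initialization in [-0.05, 0.05], the 100 training iterations, the initial learning rate of 0.05, the momentum term of 0.045, per-iteration shuffling, and the seven hidden-unit counts. Everything else, including the single-hidden-layer architecture, the sigmoid/linear activations, the squared-error loss, the mini-batch size, and the specific adaptive-learning-rate rule (grow on improvement, shrink otherwise), is an illustrative assumption and not specified in the source.

```python
import numpy as np

def train_autoencoder(X, n_hidden, n_iters=100, lr=0.05, momentum=0.045,
                      batch_size=32, seed=0):
    """Single-hidden-layer autoencoder trained with backprop + momentum.

    X is assumed to be the whitened feature matrix (n_samples, n_features).
    Architecture, loss, batch size, and adaptive-rate rule are assumptions.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]

    # Random initialization in [-0.05, 0.05], as described in the passage.
    W1 = rng.uniform(-0.05, 0.05, size=(n_features, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.05, 0.05, size=(n_hidden, n_features))
    b2 = np.zeros(n_features)

    # Momentum buffers, one per parameter.
    vW1, vb1, vW2, vb2 = (np.zeros_like(p) for p in (W1, b1, W2, b2))

    prev_loss = np.inf
    for it in range(n_iters):
        # Shuffle the data and split it into batches each iteration.
        order = rng.permutation(len(X))
        epoch_loss = 0.0
        for start in range(0, len(X), batch_size):
            xb = X[order[start:start + batch_size]]

            # Forward pass: sigmoid hidden layer, linear reconstruction.
            h = 1.0 / (1.0 + np.exp(-(xb @ W1 + b1)))
            recon = h @ W2 + b2
            err = recon - xb
            epoch_loss += 0.5 * np.sum(err ** 2)

            # Backward pass (squared-error gradients).
            d_recon = err / len(xb)
            gW2 = h.T @ d_recon
            gb2 = d_recon.sum(axis=0)
            d_h = (d_recon @ W2.T) * h * (1.0 - h)
            gW1 = xb.T @ d_h
            gb1 = d_h.sum(axis=0)

            # Gradient step with momentum.
            vW1 = momentum * vW1 - lr * gW1; W1 += vW1
            vb1 = momentum * vb1 - lr * gb1; b1 += vb1
            vW2 = momentum * vW2 - lr * gW2; W2 += vW2
            vb2 = momentum * vb2 - lr * gb2; b2 += vb2

        # Assumed adaptive learning rate: grow on improvement, shrink otherwise.
        lr = lr * 1.05 if epoch_loss < prev_loss else lr * 0.7
        prev_loss = epoch_loss

    return W1, b1, W2, b2

# Seven configurations with 2, 4, 8, 12, 16, 20, and 24 hidden units.
if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((500, 16))  # stand-in for whitened features
    models = {k: train_autoencoder(X, n_hidden=k) for k in (2, 4, 8, 12, 16, 20, 24)}
```

Note that "batch-mode" backpropagation is often read as full-batch updates, in which case shuffling has no effect on the gradients; the sketch above interprets "batch shuffling" as reshuffling mini-batches each iteration, which is one plausible reading.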