“…The KWS results are produced for six different ASR systems: (1) GMM, the baseline GMM/HMM system, which is a discriminatively trained, speaker-adaptively trained acoustic model; (2) BSRS, a bootstrap and restructuring model [20] in which the original training data is randomly re-sampled to produce multiple subsets and the resulting models are aggregated at the state level to produce a large, composite model; (3) CU-HTK, a TANDEM HMM system from Cambridge University using cross-word, state-clustered, triphone models trained with MPE, fMPE, and speaker-adaptive training. For efficiency, the MLP features were incorporated in the same fashion as [21]; (4) MLP, a multi-layer perceptron model [22], which is a GMM-based ASR system that uses neural-network features; (5) NN-GMM, a speaker-adaptively and discriminatively trained GMM/HMM system from RWTH Aachen University using bottleneck neural network features [23] and a 4-gram Kneser-Ney LM with optimized discounting parameters [24], decoded with a modified version of the RWTH open-source decoder [25]; and (6) DBN, a deep belief network hybrid model [26, 27] with discriminative pre-training, frame-level cross-entropy training, and state-level minimum Bayes risk sequence training. The GMM, BSRS, DBN, and MLP models are built with the IBM Attila toolkit [28].…”
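As background for system (6), a hybrid DBN/DNN-HMM model typically outputs state posteriors that must be converted to scaled likelihoods (by dividing by the state priors) before HMM decoding. The sketch below illustrates that standard conversion only; it is not the authors' implementation, and the array shapes, prior floor, and example numbers are hypothetical.

```python
import numpy as np

def posteriors_to_scaled_loglikelihoods(log_posteriors, state_counts, prior_floor=1e-8):
    """Convert network state posteriors to scaled log-likelihoods for hybrid HMM decoding.

    A hybrid DBN/DNN-HMM network estimates P(state | observation); HMM decoding
    needs p(observation | state), which (up to a constant) is obtained by
    dividing the posterior by the state prior P(state).

    log_posteriors : (T, S) array of log P(state | o_t) from the network
    state_counts   : (S,) array of state occupancy counts from the training alignment
    """
    priors = state_counts / state_counts.sum()
    log_priors = np.log(np.maximum(priors, prior_floor))
    # log p(o_t | s) is proportional to log P(s | o_t) - log P(s)
    return log_posteriors - log_priors

# Hypothetical example: 3 frames, 4 context-dependent states.
log_post = np.log(np.array([[0.7, 0.1, 0.1, 0.1],
                            [0.2, 0.5, 0.2, 0.1],
                            [0.1, 0.1, 0.2, 0.6]]))
counts = np.array([1000.0, 4000.0, 3000.0, 2000.0])
print(posteriors_to_scaled_loglikelihoods(log_post, counts))
```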