Abstract-Adaptive training is a powerful approach for building speech recognition systems on non-homogeneous training data. Recently, approaches based on predictive model-based compensation schemes, such as Joint Uncertainty Decoding (JUD) and Vector Taylor Series (VTS), have been proposed. This paper reviews these model-based compensation schemes and relates them to factor-analysis style systems. Forms of Maximum Likelihood (ML) adaptive training with these approaches are described, based on both second-order optimisation schemes and Expectation Maximisation (EM). However, discriminative training is used in many state-of-the-art speech recognition systems. Hence, this paper proposes discriminative adaptive training with predictive model-compensation approaches for noise robust speech recognition. This training approach is applied to both JUD and VTS compensation with Minimum Phone Error (MPE) training. A large-scale multi-environment training configuration is used, and the systems are evaluated on a range of in-car collected data tasks.

I. INTRODUCTION

Speech recognition in noisy environments is a difficult task, and a range of solutions has been proposed to address it [1], [2], [3]. One approach that has been successfully applied across a range of tasks is model-based compensation. These schemes include Vector Taylor Series (VTS) [1] and Joint Uncertainty Decoding (JUD) [4]. Here a "clean" acoustic model is adapted to be representative of a model trained in the target acoustic condition. To apply these approaches, it is necessary to estimate a background noise model and, using a mismatch function that represents the impact of noise on the clean speech, combine it with the clean model parameters. Schemes have previously been proposed that allow the noise model parameters to be robustly estimated using Maximum Likelihood (ML) [3], [5]; these are not discussed further in this paper. However, it is not always possible to directly train clean model parameters.
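To make the compensation step concrete, a commonly used mismatch function in the log-spectral domain is y = log(exp(x + h) + exp(n)), where x, n and h are the clean speech, additive noise and convolutional channel, respectively; a first-order expansion about the component means then yields compensated Gaussian parameters. The sketch below illustrates this standard first-order form under that log-spectral assumption (the paper itself works with cepstral features via a DCT, which is omitted here, and the function names are illustrative, not from the paper):

```python
import numpy as np

def compensate_mean(mu_x, mu_n, mu_h):
    """Noisy-speech mean under y = log(exp(x + h) + exp(n)),
    evaluated at the clean-speech, noise and channel means."""
    return np.logaddexp(mu_x + mu_h, mu_n)

def compensate_cov(sigma_x, sigma_n, mu_x, mu_n, mu_h):
    """First-order (VTS-style) covariance compensation:
    Sigma_y ~= G Sigma_x G^T + (I - G) Sigma_n (I - G)^T,
    where G = dy/dx evaluated at the means (diagonal here)."""
    g = 1.0 / (1.0 + np.exp(mu_n - mu_x - mu_h))  # diagonal of the Jacobian
    G = np.diag(g)
    I = np.eye(len(g))
    return G @ sigma_x @ G.T + (I - G) @ sigma_n @ (I - G).T
```

In the low-noise limit (noise mean far below the speech mean) the Jacobian tends to the identity and the compensated model collapses back to the clean model; in the high-noise limit it is dominated by the noise model, matching the intuition behind model-based compensation.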
Current state-of-the-art speech recognition systems are trained on large amounts of speech data, typically collected in a range of acoustic conditions. It is also expected that collecting data in conditions related to the actual target application conditions should improve performance. To address this problem, adaptive training has been proposed [5], [6], [7], [8]. These schemes make use of adaptation/compensation transformations during training. The aim is to train a canonical model that removes any dependence on speaker or noise condition. Originally proposed to account for the differences between speakers in the training data [6], these schemes have more recently been applied to situations where there are large variations in the background noise [5], [7]. Model-based compensation schemes have been used [5],