“…To further improve robustness to varying styles or tasks, unsupervised test-set adaptation, for example, to a particular broadcast show, may be used (Della Pietra et al, 1992;Bulyko et al, 2012Bulyko et al, , 2007Federico, 1999Federico, , 2003Gildea and Hofmann, 1999;Chen et al, 2001;Mrva andWoodland, 2004, 2006;Chien et al, 2005;Tam and Schultz, 2005;Liu et al, 2007Liu et al, , 2008Liu et al, , 2009Liu et al, , 2010. As directly adapting n-gram word probabilities is impractical on limited amounts of data, standard adaptation schemes only involve updating one single, context independent interpolation weight for the component models (Iyer et al, 1994;Rosenfeld, 1996;Clarkson and Robinson, 1997;Seymore and Rosenfeld, 1997;Iyer and Ostendorf, 1999;Mrva and Woodland, 2006). …”