“…A well-established approach in such cases (Vovk, 1990; Littlestone & Warmuth, 1994; Littlestone, 1989; Feder, Merhav, & Gutman, 1992; Merhav & Feder, 1993; Cesa-Bianchi et al., 1997; Cesa-Bianchi et al., 1996) is to assume nothing about the (x_t, y_t) pairs and instead, for a given F, to give bounds on the number of mistakes made by a given learning algorithm in terms of the minimum over f ∈ F of the number η of trials t for which f(x_t) ≠ y_t. Learning models like this are often referred to as agnostic learning models (Kearns, Schapire, & Sellie, 1994).…”
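The quantity η in the passage can be illustrated with a minimal sketch. It assumes F is a small finite class of Python callables and that the trials are given as (x, y) pairs; the names `count_mistakes` and `best_in_class_mistakes` and the toy data are hypothetical and do not come from the cited work.

```python
from typing import Callable, Iterable, List, Tuple

Trial = Tuple[int, int]            # one (x_t, y_t) pair
Hypothesis = Callable[[int], int]  # one candidate f in F


def count_mistakes(f: Hypothesis, trials: Iterable[Trial]) -> int:
    """Number of trials t on which f(x_t) differs from y_t."""
    return sum(1 for x, y in trials if f(x) != y)


def best_in_class_mistakes(F: Iterable[Hypothesis], trials: List[Trial]) -> int:
    """eta: the minimum, over f in F, of the mistakes f makes on the trials."""
    return min(count_mistakes(f, trials) for f in F)


# Hypothetical toy sequence; no distributional assumption is made about it.
trials = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 1)]

# Hypothetical finite class F: always 0, always 1, and parity of x.
F = [lambda x: 0, lambda x: 1, lambda x: x % 2]

eta = best_in_class_mistakes(F, trials)
print(eta)  # 1: the parity rule errs only on the trial (4, 1)
```

A mistake bound in this setting is then a guarantee on the learner's own mistake count expressed as a function of `eta`, rather than a claim that some f in F labels every trial correctly.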