“…In fact, many state‐of‐the‐art algorithms search for a weighted combination of simpler rules (Germain et al): bagging (Breiman), boosting (Schapire et al; Schapire & Singer), and Bayesian approaches (Gelman et al), or even kernel methods (Vapnik) and neural networks (Bishop). The major open problems in this scenario are how to weight the different rules in order to obtain good performance (Berend & Kontorovitch; Catoni; Lever et al; Nitzan & Paroush; Parrado‐Hernández et al), how this performance can be assessed (Catoni; Donsker & Varadhan; Germain et al; Lacasse et al; Langford & Seeger; Laviolette & Marchand; Lever et al; London et al; McAllester; Shawe‐Taylor & Williamson; Tolstikhin & Seldin; Van Erven), and how this theoretical framework can be exploited to derive new learning approaches or to apply it in other contexts (Audibert; Audibert & Bousquet; Bégin et al; Germain et al; McAllester; Morvant; Ralaivola et al; Roy et al; Seeger; Seldin et al; Seldin & Tishby; Shawe‐Taylor & Langford). The PAC‐Bayes approach is one of the sharpest analysis frameworks in this context, since it can provide tight bounds on the risk of the Gibbs classifier (GC), also called the randomized (or probabilistic) classifier, and of the Bayes classifier (BC), also called the weighted majority vote classifier (Germain et al).…”
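To make the two objects in the last sentence concrete, the following is a minimal sketch of the standard definitions behind this terminology; the posterior $\rho$ over the set of voters $\mathcal{H}$, the data distribution $\mathcal{D}$, and the symbols $G_\rho$ and $B_\rho$ are notation introduced here for illustration, not taken from the excerpt. For binary labels $y \in \{-1,+1\}$, the Gibbs classifier draws a voter at random according to $\rho$ for each prediction, while the Bayes classifier takes the $\rho$-weighted majority vote:
\begin{align}
  % Gibbs (randomized) classifier: draw h ~ rho and predict h(x); its risk is the
  % rho-average of the individual voters' risks.
  R(G_\rho) &= \mathop{\mathbb{E}}_{h \sim \rho} \, \mathop{\mathbb{E}}_{(x,y) \sim \mathcal{D}} \mathbb{1}\!\left[ h(x) \neq y \right], \\
  % Bayes (weighted majority vote) classifier: predict the class receiving the
  % largest rho-weighted vote.
  B_\rho(x) &= \operatorname{sign}\!\Big( \mathop{\mathbb{E}}_{h \sim \rho} h(x) \Big), \\
  % Whenever the majority vote errs, at least half of the rho-weighted voters err,
  % which gives the classical factor-of-two relation between the two risks.
  R(B_\rho) &\leq 2\, R(G_\rho).
\end{align}
The last inequality is the classical argument that makes PAC‐Bayes bounds on the Gibbs risk informative for the weighted majority vote as well: since the majority vote can only err when at least half of the $\rho$-weighted voters err, any bound on $R(G_\rho)$ immediately yields a (possibly loose) bound on $R(B_\rho)$.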