Aggregated predictors are obtained by making a set of basic predictors vote according to some weights, that is, according to some probability distribution. Randomized predictors are obtained by sampling from a set of basic predictors, according to some prescribed probability distribution. Thus, aggregated and randomized predictors have in common that they are not defined by a minimization problem, but by a probability distribution on the set of predictors (both constructions are written out in symbols in the sketch at the end of this section). In statistical learning theory, there is a set of tools designed to understand the generalization ability of such procedures: PAC-Bayesian or PAC-Bayes bounds.

Since the original PAC-Bayes bounds [163, 124], these tools have been considerably improved in many directions (we will, for example, describe a simplified version of the localization technique of [39, 41] that was missed by the community, and later rediscovered as "mutual information bounds"). Very recently, PAC-Bayes bounds have received considerable attention: for example, there was a workshop on PAC-Bayes at NIPS 2017, "(Almost) 50 Shades of Bayesian Learning: PAC-Bayesian trends and insights", organized by B. Guedj, F. Bach and P. Germain. One reason for this recent interest is the successful application of these bounds to neural networks [65].

An elementary introduction to PAC-Bayes theory is still missing. This is an attempt to provide such an introduction.

This is a preliminary version. If you find any typo or mistake, or if you think your work should be cited and is not, please let me know, and I will update the tutorial accordingly. Since the first version: fixed (minor) problems in Theorem 4.5, in Lemma 4.6 and in Subsection 6.5.2, fixed many typos (including some in the proof of Theorem 4.3), and included references [26, 131, 114].

1 I don't want to scare the reader with measurability conditions, as I will not check them in this tutorial anyway. Here, the exact condition ensuring that what follows is well defined is that, for any $A \in \mathcal{T}$, the function $((x_1, y_1), \dots, (x_n, y_n)) \mapsto [\rho((x_1, y_1), \dots, (x_n, y_n))](A)$ is measurable. That is, $\rho$ is a regular conditional probability.

2 See the title of van Erven's tutorial [175]: "PAC-Bayes mini-tutorial: a continuous union bound". Note, however, that Catoni argues in [41] that PAC-Bayes bounds are actually more than that; we will come back to this in Section 4.
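To make the vote/draw distinction from the opening paragraph concrete, here is one common way to write both objects in symbols. The notation below (basic predictors $f_\theta$ indexed by a parameter $\theta$ in a set $\Theta$, and a distribution $\rho$ on $\Theta$) is an illustrative sketch, not the tutorial's formal setup, which is introduced later:
$$
\hat{f}_{\rho}(x) = \mathbb{E}_{\theta \sim \rho}\bigl[f_{\theta}(x)\bigr] = \int_{\Theta} f_{\theta}(x)\,\rho(\mathrm{d}\theta) \qquad \text{(aggregated predictor: a weighted vote)},
$$
$$
\tilde{\theta} \sim \rho, \quad x \mapsto f_{\tilde{\theta}}(x) \qquad \text{(randomized predictor: a single draw from } \rho\text{)}.
$$
In both cases, the object produced by the learning procedure is the distribution $\rho$ itself, rather than the solution of a minimization problem.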