Modern statistical software and machine learning libraries are enabling semi-automated statistical inference. In this context, it is becoming ever easier to fit many models to the data at hand, thereby reversing the Fisherian way of conducting science, in which data are collected after the scientific hypothesis (and hence the model) has been determined. The renewed goal of the statistician becomes to help the practitioner choose within such large and heterogeneous families of models, a task known as model selection. The Bayesian paradigm offers a systematic way of addressing this problem. This approach, launched by Harold Jeffreys in his 1939 book Theory of Probability, has evolved remarkably in recent decades, bringing several new theoretical and methodological advances. Some of these recent developments are the focus of this survey, which tries to present a unifying perspective on work carried out by different communities. In particular, we focus on the non-asymptotic out-of-sample performance of Bayesian model selection and averaging techniques, and draw connections with penalized maximum likelihood. We also describe recent extensions to wider classes of probabilistic frameworks, including high-dimensional, unidentifiable, and likelihood-free models.