A common goal of scientific disciplines is to understand the relationships between observable quantities and to construct models that encode such relationships. Eventually any model, and its supporting hypothesis, needs to be tested against observations: the celebrated falsifiability criterion of Popper (Popper, 1959). Hence experiments, measurements, and observations (in one word, data) have always played a pivotal role in science, at least since Galileo's experiment of dropping objects from the leaning tower of Pisa. Yet it is only in the last decade that libraries' bookshelves have started to pile up with books about the data revolution, big data, data science, and various modifications of these terms. While there is certainly a tendency, both in science and in publishing, to rebrand old ideas and to inflate buzzwords, one cannot deny that the unprecedented amount of collected data of any sort (be it customer buying preferences, health and genetic records, high-energy particle collisions, supercomputer simulation results, or, of course, space weather data) makes the time we are living in unique in history. The discipline that benefits the most from this data revolution is certainly machine learning. This field is traditionally seen as a subset of artificial intelligence, although its boundaries and definition are somewhat blurry.
For the purposes of this book, we broadly refer to machine learning as the set of methods and algorithms that can be used for the following problems: (1) make predictions in time or space of a continuous quantity (regression); (2) assign a datum to a class within a prespecified set (classification); (3) assign a datum to a class within a set that is determined by the algorithm itself (clustering); (4) reduce the dimensionality of a dataset by exposing relationships among variables; and (5) establish linear and nonlinear relationships and causalities among variables.

Machine learning is in its golden age today for the simple reason that methods, algorithms, and tools studied and designed during the last two decades (and sometimes forgotten) have started to produce unexpectedly good results in the last 5 years, exploiting the historically unique combination of big-data availability and cheap computing power.

The single methodology that has been popularized the most by nonspecialist media as the archetype of machine learning's groundbreaking promise is probably the massive multilayer neural network, often referred to as deep learning (LeCun et al., 2015). For instance, deep learning is the technology behind recent successes in image and speech recognition (with the former recently achieving better-than-human accuracy; He et al., 2015) and the first computer ever to defeat a world champion in the game of Go (Silver et al., 2016).

The popular media often focus on the technological applications of machine learning, which has propelled recent advances in many areas, such as self-driving cars, online fraud detection, personalized advertisement and recommendation, real-...
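To make the taxonomy above concrete, the following is a minimal sketch of the first four problem classes on synthetic data. The choice of scikit-learn, the toy data, and all variable names are our own illustration; the book does not prescribe any particular library.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # 200 data, 3 variables (synthetic)

# (1) Regression: predict a continuous quantity.
y_cont = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)
reg = LinearRegression().fit(X, y_cont)

# (2) Classification: assign each datum to one of a prespecified set of classes.
y_class = (X[:, 0] > 0).astype(int)
clf = LogisticRegression().fit(X, y_class)

# (3) Clustering: the classes are determined by the algorithm itself.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# (4) Dimensionality reduction: project onto fewer variables that expose
#     the dominant relationships in the data.
pca = PCA(n_components=2).fit(X)

print(reg.coef_.round(1))                 # coefficients recovered by regression
print(clf.score(X, y_class))              # classification accuracy on the data
print(np.unique(km.labels_))              # cluster labels discovered
print(pca.explained_variance_ratio_.sum())  # variance captured by 2 components
```

Problem (5), establishing relationships and causalities among variables, spans a broader family of methods (from correlation analysis to causal inference) and is not reduced to a single call here.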