The story of statistics in geotechnical engineering can be traced to Lumb's classic Canadian Geotechnical Journal paper on "The Variability of Natural Soils", published in 1966. In parallel, the story of risk management in geotechnical engineering has progressed from design by prescriptive measures that do not require site-specific data, to more refined estimation of site-specific response using limited data from site investigation as inputs to physical models, to quantitative risk assessment (QRA) requiring considerable data at regional/national scales. In an era where data are recognised as the "new oil", it makes sense to lean towards decision-making strategies that are more responsive to data, particularly with zettabytes coming our way. In fact, we already have a lot of data, but the vast majority is shelved after a project is completed ("dark data"). It does not make sense to reduce one zettabyte to a few bytes describing a single cautious value. Nor does it make sense to expect big data to be precise and to fit a particular favourite physical model, as demanded by the classical deterministic world view. This paper advocates the position that there is value in data of any kind, be it of good or not-so-good quality, or a right or wrong fit to a physical model. The challenge for the new generation of researchers is to uncover this value by hearing what data have to say for themselves, using probabilistic, machine learning, or other data-driven methods, including those informed by physics and human experience, and to re-imagine the role of the geotechnical engineer in an immersive environment likely to be imbued with machine intelligence.