Eighty percent of current world energy consumption is satisfied by subsurface resources. In the future, billions of watts of electrical power will be generated from geothermal energy sources. The subsurface earth can store energy produced from renewable sources, such as wind and solar, and could provide safe storage of contaminants and hazardous nuclear waste. Engineering of the subsurface earth is critical for fossil energy production, geothermal energy production, and carbon geo-sequestration. However, the subsurface is opaque, inaccessible, and heterogeneous, with processes spanning the nanometer to kilometer scale; these factors limit our understanding (characterization) and constrain the efficient engineering of complex subsurface earth resources. There has been a rapid increase in sensor deployment, data acquisition, data storage, and data processing for the purpose of better engineering and characterizing the subsurface earth. This has promoted large-scale deployment of data-driven methods, machine learning, and data analytics workflows. Subsurface data range from nanometer-scale to kilometer-scale measurements, both passive and active. They are acquired in the form of physical fluid/solid samples, images, 3D scans, time-series data, waveforms, and depth-based multi-modal signals, to name a few. Subsurface data sense various physical phenomena and properties, e.g., transport, chemical, mechanical, electrical, and thermal. Integrating such varied data sources, acquired at varying scales, rates, resolutions, and volumes, requires robust machine learning methods to better characterize and engineer the subsurface earth. Machine learning has been shown to improve the efficiency, efficacy, and productivity of the subsurface engineering and characterization efforts required to produce fossil/geothermal energy and to sequester carbon dioxide.