Data science and technology offer transformative tools and methods to science. This review article highlights the latest developments and progress in the interdisciplinary field of data-driven plasma science (DDPS). Large amounts of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational, or computational, are generated or collected by machines today. It is becoming impractical for humans to analyze all the data manually. Therefore, it is imperative to train machines to analyze and, eventually, interpret such data as intelligently as humans but far more efficiently in terms of quantity. Despite the recent impressive progress in applications of data science to plasma science and technology, the emerging field of DDPS is still in its infancy. Fueled by some of the most challenging problems, such as fusion energy, plasma processing of materials, and fundamental understanding of the universe through observable plasma phenomena, DDPS is expected to continue to benefit significantly from the interdisciplinary marriage between plasma science and data science into the foreseeable future.
CONTENTS

I. Introduction

II. Fundamental Data Science
    A. Introduction
    B. Data Reduction/Compression
    C. Dimensional reduction and sparse modeling
    D. ML-enhanced modeling and simulation
    E. ML Hardware and integration with models
    F. Workflow Automation
    G. Uncertainty Quantification
    H. Visualization and Data Understanding
    I. ML Control Theory

III. Basic Plasma Physics and Laboratory Experiments
    A. Introduction
    B. Spectroscopy, imaging and tomography
    C. Sparse measurement and noise
    D. Synthetic instruments and data
    E. Experimental data visualization
    F. High-rep rate laser experiments
    G. Charged particle beams
    H. Control and Optimisation of Plasma Accelerator Experiments
    I. Dusty and complex plasmas
    J. Physics and machine learning
    K. Challenges and outlook

IV. Magnetic Confinement Fusion
    A. Introduction
    B. Data-Driven Physics Models
    C. Optimizing experimental workflows with data-driven methods
    D. Diagnostics and Fusion Data Streams
    E. Prediction of Tokamak Disruption
    F. Surrogate models of fusion plasma
    G. Magnetic Fusion Energy Data Challenges and Solutions
    H. Data Science for extreme scale simulation
    I. Challenges and outlook

V. Inertial confinement fusion and high-energy-density physics
    A. Introduction
    B. Representation learning for multimodal data
    C. Transfer learning for simulation and experiment
    D. Uncertainty quantification and Bayesian inference
    E. High-performance computing and simulation acceleration
    F. Design exploration and optimization
    G. Self-driving experimental facilities
    H. Challenges and outlook

VI. Space and astronomical plasmas
    A. Introduction
    B. Space and ground instruments
    C. Space weather prediction
    D. Transfer learning to improve historic data
    E. Surrogate models of fluid closures using machine learning
    F. Magnetic reconnection
    G. Challenges and outlook

VII. Plasma tec...