In neuroimaging studies, regression models are frequently used to identify associations between imaging features and clinical outcomes, where the number of imaging features (e.g., hundreds of thousands of voxel-level predictors) far exceeds the number of subjects. Classical best subset selection or penalized variable selection methods that perform well for low- or moderate-dimensional data do not scale to ultrahigh-dimensional neuroimaging data. To reduce the dimensionality, variable screening has emerged as a powerful tool for feature selection in neuroimaging studies. We present a selective review of recent developments in ultrahigh-dimensional variable screening, with a focus on their practical performance in analyzing neuroimaging data with complex spatial correlation structures and high dimensionality. We conduct extensive simulation studies to compare the methods in terms of selection accuracy and computational cost. We also present analyses of resting-state functional magnetic resonance imaging data from the Autism Brain Imaging Data Exchange study.

KEYWORDS: correlated covariates, imaging data analysis, linear regression, variable screening

1 | INTRODUCTION

Recent advances in neuroimaging technology have generated high-resolution brain imaging data that can measure brain function and structure with increasing accuracy. This provides unprecedented opportunities for researchers to precisely identify the brain regions that are strongly associated with clinical symptoms, which will have a great impact on public health and precision medicine. To this end, a class of regression models has been widely used, where the response variable is the clinical outcome of interest and the predictors include imaging features. We refer to this model as scalar-on-image regression. Performing variable selection in scalar-on-image regression directly identifies the brain regions of interest. However, in a typical neuroimaging study, a three-dimensional brain image may contain up to millions of voxels, while the number of subjects is usually in the range of hundreds to thousands. Thus, for voxel-level selection in scalar-on-image regression, the number of predictors can grow exponentially with the sample size. Because of the resulting computational infeasibility, classical variable selection methods, such as best subset selection, are not directly applicable in this setting. Similarly, regularization-based variable selection methods have been extensively studied during the past decades, and have many