In recent years, Alzheimer’s disease (AD) has been a serious threat to human health. Researchers and clinicians alike encounter a significant obstacle when trying to accurately identify and classify AD stages. Several studies have shown that multimodal neuroimaging input can assist in providing valuable insights into the structural and functional changes in the brain related to AD. Machine learning (ML) algorithms can accurately categorize AD phases by identifying patterns and linkages in multimodal neuroimaging data using powerful computational methods. This study aims to assess the contribution of ML methods to the accurate classification of the stages of AD using multimodal neuroimaging data. A systematic search is carried out in IEEE Xplore, Science Direct/Elsevier, ACM DigitalLibrary, and PubMed databases with forward snowballing performed on Google Scholar. The quantitative analysis used 47 studies. The explainable analysis was performed on the classification algorithm and fusion methods used in the selected studies. The pooled sensitivity and specificity, including diagnostic efficiency, were evaluated by conducting a meta-analysis based on a bivariate model with the hierarchical summary receiver operating characteristics (ROC) curve of multimodal neuroimaging data and ML methods in the classification of AD stages. Wilcoxon signed-rank test is further used to statistically compare the accuracy scores of the existing models. With a 95% confidence interval of 78.87–87.71%, the combined sensitivity for separating participants with mild cognitive impairment (MCI) from healthy control (NC) participants was 83.77%; for separating participants with AD from NC, it was 94.60% (90.76%, 96.89%); for separating participants with progressive MCI (pMCI) from stable MCI (sMCI), it was 80.41% (74.73%, 85.06%). With a 95% confidence interval (78.87%, 87.71%), the Pooled sensitivity for distinguishing mild cognitive impairment (MCI) from healthy control (NC) participants was 83.77%, with a 95% confidence interval (90.76%, 96.89%), the Pooled sensitivity for distinguishing AD from NC was 94.60%, likewise (MCI) from healthy control (NC) participants was 83.77% progressive MCI (pMCI) from stable MCI (sMCI) was 80.41% (74.73%, 85.06%), and early MCI (EMCI) from NC was 86.63% (82.43%, 89.95%). Pooled specificity for differentiating MCI from NC was 79.16% (70.97%, 87.71%), AD from NC was 93.49% (91.60%, 94.90%), pMCI from sMCI was 81.44% (76.32%, 85.66%), and EMCI from NC was 85.68% (81.62%, 88.96%). The Wilcoxon signed rank test showed a low P-value across all the classification tasks. Multimodal neuroimaging data with ML is a promising future in classifying the stages of AD but more research is required to increase the validity of its application in clinical practice.