This research proposes a novel ensemble learning approach for precise feature extraction and fusion from multi-modal medical images, applicable to the diagnosis of rare neurological disorders. The proposed method combines the complementary strengths of Convolutional Neural Networks and Generative Adversarial Networks (CNN-GAN) to improve diagnostic accuracy and enable early detection. To this end, a diverse dataset of multi-modal medical images from patients with rare neurological disorders was collected. The multi-modal images are fused using a GAN-based image-to-image translation technique that produces synthetic images capturing essential clinical information from the different modalities. Features are extracted from large clinical imaging databases using transfer learning with CNN architectures designed specifically for medical image analysis. Aggregating the distinctive features of each modality yields a comprehensive representation of the underlying pathophysiology. The predictions of several CNN models are then integrated through ensemble learning techniques, including majority voting, weighted averaging, and stacking, to arrive at the final diagnosis. The ensemble approach also enhances the robustness and reliability of the diagnostic model, improving its effectiveness in identifying rare neurological conditions. Experimental analysis shows that the proposed technique outperforms single-modal designs, demonstrating the importance of multi-modal image fusion and feature extraction. The proposed method achieves an accuracy of 99.99%, compared with 85.69% for XGBoost and 96.12% for LSTM, corresponding to an average improvement in accuracy of approximately 13.3% over existing methods. The proposed method was implemented in Python.
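
The following is a minimal illustrative sketch, not the authors' implementation, of how per-modality CNN predictions could be combined by majority voting and weighted averaging as described above; the probability values, modality labels, and weights are hypothetical placeholders.

# Illustrative sketch (assumed, not the paper's code): fusing predictions
# from modality-specific CNNs via majority voting and weighted averaging.
import numpy as np

# Hypothetical softmax outputs from three modality-specific CNNs for one patient
# (rows: models, columns: class probabilities).
cnn_probs = np.array([
    [0.10, 0.85, 0.05],   # CNN trained on one imaging modality (assumed)
    [0.20, 0.70, 0.10],   # CNN trained on a second modality (assumed)
    [0.30, 0.60, 0.10],   # CNN trained on GAN-fused images (assumed)
])

# Majority voting: each model casts one vote for its most probable class.
votes = cnn_probs.argmax(axis=1)
majority_class = np.bincount(votes).argmax()

# Weighted averaging: blend class probabilities with per-model weights
# (e.g., proportional to validation accuracy; values here are assumed).
weights = np.array([0.40, 0.35, 0.25])
avg_probs = np.average(cnn_probs, axis=0, weights=weights)
weighted_class = avg_probs.argmax()

print(f"Majority-vote diagnosis: class {majority_class}")
print(f"Weighted-average diagnosis: class {weighted_class}")

A stacking variant would instead feed the concatenated per-model probabilities into a small meta-classifier trained on held-out data; the voting and averaging rules above require no additional training.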