Macular edema (ME) is a retinal condition that affects a patient's central vision. ME leads to the accumulation of fluid in the surrounding macular region, resulting in a swollen macula. Optical coherence tomography (OCT) and fundus photography are the two widely used retinal examination techniques that can effectively detect ME, and many researchers have used fundus and OCT imaging for this purpose. However, to the best of our knowledge, no published work fuses the findings from both retinal imaging modalities for a more effective and reliable diagnosis of ME. In this paper, we propose an automated framework for classifying ME and healthy eyes using retinal fundus and OCT scans. The proposed framework is based on deep ensemble learning: the input fundus and OCT scans are first recognized by a deep convolutional neural network (CNN) and processed accordingly. The processed scans are then passed to the second layer of the deep CNN model, which extracts the required feature descriptors from both images. The extracted descriptors are concatenated and passed to a supervised hybrid classifier formed as an ensemble of artificial neural networks, support vector machines and naïve Bayes. The proposed framework was trained on 73,791 retinal scans and validated on 5,100 scans from the publicly available Zhang and Rabbani datasets. It achieved an accuracy of 94.33% in classifying ME and healthy subjects, and mean Dice coefficients, measured against the clinical markings, of 0.9019 ± 0.04 for extracting retinal fluids, 0.7069 ± 0.11 for extracting hard exudates and 0.8203 ± 0.03 for extracting retinal blood vessels.
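For readers unfamiliar with the segmentation metric reported above, the Dice coefficient between a predicted binary mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the evaluation code used in this work. The function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from itertools import chain


def dice_coefficient(predicted, reference):
    """Dice overlap between two binary masks given as lists of rows.

    Any truthy pixel counts as foreground. Hypothetical helper for
    illustration; not taken from the paper's implementation.
    """
    p = list(chain.from_iterable(predicted))
    r = list(chain.from_iterable(reference))
    if len(p) != len(r):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(1 for a, b in zip(p, r) if a and b)
    # Total foreground pixels in each mask, summed.
    total = sum(1 for a in p if a) + sum(1 for b in r if b)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 3 foreground pixels each, 2 of them shared.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]

print(round(dice_coefficient(predicted, reference), 4))  # 0.6667
```

A value of 1.0 indicates perfect overlap with the clinical marking and 0.0 indicates no overlap. A mean Dice coefficient such as those reported above would be obtained by averaging this per-scan score over the evaluation set.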