Summary

Background
As deep learning becomes increasingly accessible for automated detection of diabetic retinopathy (DR), questions persist regarding its performance equity across diverse identity groups. We aimed to explore the fairness of current deep learning models and to develop a more equitable model that minimizes disparities in performance across groups.

Methods
This study used one proprietary and two publicly available datasets, containing two-dimensional (2D) wide-angle color fundus images, scanning laser ophthalmoscopy (SLO) fundus images, and three-dimensional (3D) optical coherence tomography (OCT) B-scans, to assess deep learning models for DR detection. We developed a fair adaptive scaling (FAS) module that dynamically adjusts the significance of samples during model training, aiming to lessen performance disparities across identity groups. FAS was incorporated into both 2D and 3D deep learning models for the binary classification of DR versus non-DR cases. The area under the receiver operating characteristic curve (AUC) was adopted to measure model performance. Additionally, we devised an equity-scaled AUC (ES-AUC) metric that evaluates model fairness by balancing overall AUC against disparities among groups.

Findings
Using in-house color fundus images on the racial attribute, the overall AUC and ES-AUC of EfficientNet after integrating FAS improved from 0.88 and 0.83 to 0.90 and 0.84 (p < 0.05), where the AUCs for Asians and Whites improved by 0.04 and 0.03, respectively (p < 0.01). On gender, the overall AUC and ES-AUC of EfficientNet after integrating FAS both improved by 0.01 (p < 0.05). Using in-house SLO fundus images on race, the overall AUC and ES-AUC of EfficientNet after integrating FAS improved from 0.80 to 0.83 (p < 0.01), where the AUCs for Asians, Blacks, and Whites improved by 0.02, 0.01, and 0.04, respectively (p < 0.05).
On gender, FAS improved EfficientNet’s overall AUC and ES-AUC by 0.02 each, with the same 0.02 improvement (p < 0.01) for females and males. Using the 3D deep learning model DenseNet121 on in-house OCT B-scans on race, FAS improved the overall AUC and ES-AUC from 0.875 and 0.81 to 0.884 and 0.82, respectively, where the AUCs for Asians and Blacks improved by 0.03 and 0.02 (p < 0.01). On gender, FAS improved the overall AUC and ES-AUC of DenseNet121 by 0.04 and 0.03, and the AUCs for females and males improved by 0.05 and 0.04 (p < 0.01), respectively.

Interpretation
Existing deep learning models exhibit variable performance across diverse identity groups in DR detection. FAS proves beneficial in enhancing model equity and boosting DR detection accuracy, particularly for underrepresented groups.
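The summary describes FAS only as a module that "dynamically adjusts the significance of samples during model training"; the exact update rule is not given here. Purely as an illustration of that idea, the sketch below (an assumption, not the paper's implementation) maintains a running loss per identity group and upweights samples from groups whose loss is above the average, so harder, often underrepresented, groups contribute more to the training objective. The class name `AdaptiveGroupScaler` and its interface are hypothetical.

```python
class AdaptiveGroupScaler:
    """Hedged sketch of adaptive per-group loss scaling in the spirit of FAS.

    Not the paper's algorithm: this simply tracks an exponential moving
    average (EMA) of each group's loss and weights samples by their group's
    relative difficulty.
    """

    def __init__(self, groups, momentum=0.9):
        self.momentum = momentum
        # Running (EMA) loss per identity group, initialized uniformly.
        self.avg_loss = {g: 1.0 for g in groups}

    def update(self, group, loss):
        """Fold a new per-sample loss into the group's running average."""
        m = self.momentum
        self.avg_loss[group] = m * self.avg_loss[group] + (1 - m) * loss

    def weight(self, group):
        """Scale factor for a sample: >1 if its group is doing worse than average."""
        mean = sum(self.avg_loss.values()) / len(self.avg_loss)
        return self.avg_loss[group] / mean
```

In a training loop, each sample's loss would be multiplied by `scaler.weight(group)` before backpropagation and then passed to `scaler.update(group, loss)`, so the weighting adapts as group-level performance gaps shrink or grow.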
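The ES-AUC metric is described above only as balancing overall AUC against disparities among groups; its exact scaling is not given in this summary. One plausible formulation, sketched below as an assumption rather than the paper's definition, divides the overall AUC by one plus the summed absolute gaps between each group's AUC and the overall AUC, so larger disparities shrink the score. The helper names `auc` and `es_auc` are illustrative; the AUC itself is computed with the standard Mann-Whitney formulation.

```python
def auc(labels, scores):
    """Mann-Whitney AUC for binary labels (1 = DR, 0 = non-DR)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    # Fraction of positive/negative pairs ranked correctly (ties count 0.5).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def es_auc(labels, scores, groups):
    """Sketch of an equity-scaled AUC: overall AUC penalized by group gaps."""
    overall = auc(labels, scores)
    gap = sum(
        abs(overall - auc([y for y, g in zip(labels, groups) if g == name],
                          [s for s, g in zip(scores, groups) if g == name]))
        for name in set(groups)
    )
    # With no disparity, gap = 0 and ES-AUC equals the overall AUC.
    return overall / (1.0 + gap)
```

Under this formulation, a model with identical per-group AUCs keeps its full overall AUC, while any between-group gap discounts it, which matches the behavior reported in the Findings where ES-AUC sits at or below the overall AUC.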