In this study, we investigated a convolutional neural network (CNN)-based framework for the estimation of the best-corrected visual acuity (BCVA) from fundus images. First, we collected 53,318 fundus photographs from the Gyeongsang National University Changwon Hospital, where each fundus photograph is categorized into 11 levels by retrospective medical chart review. Then, we designed 4 BCVA estimation schemes using transfer learning with pre-trained ResNet-18 and EfficientNet-B0 models where both regression and classification-based prediction are taken into account. According to the results of the study, the predicted BCVA by CNN-based schemes is close to the actual value such that 94.37% of prediction accuracy can be achieved when 3 levels of difference can be tolerated during prediction. The mean squared error and $$R^2$$
R
2
score were measured as 0.028 and 0.654, respectively. These results indicate that the BCVA can be predicted accurately for extreme cases, i.e., the level of BCVA is close to either 0.0 or 1.0. Moreover, using the Guided Grad-CAM, we confirmed that the macula and the blood vessel surrounding the macula are mainly utilized in the prediction of BCVA, which validates the rationality of the CNN-based BCVA estimation schemes since the same area is also exploited during the retrospective medical chart review. Finally, we applied the t-distributed stochastic neighbor embedding to examine the characteristics of CNN-based BCVA estimation schemes. The developed BCVA estimation schemes can be employed to obtain the objective measurement of BVCA as well as the medical screening of people with poor access to medical care through smartphone-based fundus imaging.