Accurate evaluation of retinopathy of prematurity (ROP) severity is vital for screening and appropriate treatment. Existing deep-learning systems for assessing ROP severity do not follow clinical guidelines and are opaque. This study aims to develop an interpretable AI system that determines the ROP severity level by mimicking the clinical screening process. A total of 6100 RetCam Ⅲ wide-field digital retinal images were collected from Guangdong Women and Children Hospital at Panyu (PY) and Zhongshan Ophthalmic Center (ZOC). Of these, 3330 images from 520 pediatric patients at PY were annotated to train an object detection model that identifies lesion type and location, and 2770 images from 81 pediatric patients at ZOC were annotated for stage, zone, and the presence of plus disease. Because clinical guidelines define ROP severity as a combination of stage, zone, and the presence of plus disease, the system derives the stage from the lesion type, the zone from the lesion location, and the presence of plus disease from a plus disease classification model. The ROP severity was then calculated from these three findings and compared with the assessment of a human expert. Our method achieved an area under the curve (AUC) of 0.95 (95% confidence interval [CI] 0.90–0.98) in assessing the severity level of ROP. Compared with clinical doctors, our method achieved the highest F1 score, 0.76, in assessing the severity level of ROP. In conclusion, we developed an interpretable AI system for assessing the severity level of ROP that shows significant potential for use in clinical ROP screening.
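The integration step described above, in which stage, zone, and plus disease are combined into a severity level, can be illustrated with a minimal rule-based sketch. The abstract does not state which guideline or which severity categories the system uses, so the rules below are an assumption: they follow the widely used ETROP type 1 / type 2 criteria for stages 1 to 3 and are not necessarily the authors' rule set. The lesion label names, the `Findings` class, and both function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical detector labels mapped to ICROP stages 1-3 (assumed names).
LESION_STAGE = {
    "demarcation_line": 1,
    "ridge": 2,
    "extraretinal_proliferation": 3,
}


@dataclass(frozen=True)
class Findings:
    stage: int  # 0 = no ROP lesion detected, 1-3 = stage (4-5 not covered here)
    zone: int   # 1, 2, or 3
    plus: bool  # output of the plus disease classifier


def stage_from_lesions(labels):
    """Return the most advanced stage among the detected lesion labels."""
    return max((LESION_STAGE[label] for label in labels), default=0)


def severity(f):
    """Combine stage, zone, and plus disease (assumed ETROP-style rules)."""
    if f.stage not in (0, 1, 2, 3) or f.zone not in (1, 2, 3):
        raise ValueError("this sketch only covers stages 0-3 and zones 1-3")
    if f.stage == 0:
        return "no ROP"
    # Type 1: zone I with plus disease (any stage), zone I stage 3 without
    # plus disease, or zone II stage 2-3 with plus disease.
    if f.zone == 1 and (f.plus or f.stage == 3):
        return "type 1"
    if f.zone == 2 and f.plus and f.stage in (2, 3):
        return "type 1"
    # Type 2: zone I stage 1-2 without plus disease, or zone II stage 3
    # without plus disease.
    if f.zone == 1 and f.stage in (1, 2):
        return "type 2"
    if f.zone == 2 and f.stage == 3:
        return "type 2"
    return "other ROP"


findings = Findings(stage=stage_from_lesions(["ridge"]), zone=2, plus=True)
print(severity(findings))
```

Because every output traces back to an explicit stage, zone, and plus disease finding, a rule table of this kind keeps the final severity level inspectable, which is the sense in which such a pipeline is interpretable.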