To meet the practical demands for a wide field of view (FOV), long exit pupil distance, and high image quality in head-mounted displays, an initial structure was built from the principles of catadioptric lenses with a folded optical path (pancake optics) and the field curvature correction formula. Using even-order aspheric surfaces, a VR optical system was then developed with a full FOV of 2ω = 110°, an exit pupil distance of 15 mm, and an exit pupil diameter of 8 mm. Unlike traditional designs, a human eye model was introduced as the image plane to construct a VR–human eye optical system analysis model, allowing direct analysis of retinal image quality under different states of the human eye. Based on medical statistical data, the impact of different light intensity conditions on the retinal image quality of the VR optical system was then explored. Experimental results showed that changes in light intensity alter the pupil diameter, which in turn influences the final image quality of the system. To address this issue, two solutions were proposed: adjusting the screen distance and changing the lens spacing. Both methods significantly improved image quality on the retina, increasing the average MTF value by more than 30%. The two methods were compared, providing a reference for future VR optical system designs.
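The abstract does not state which field curvature correction formula was used. One common choice for a starting design is the Petzval sum, and the sketch below assumes that form. The function name, the surface tuples, and all numerical values are hypothetical illustrations, not data from this work. A reflection is modeled by setting the index after the surface to the negative of the index before it, and a radius is positive when the center of curvature lies on the side the light is travelling toward.

```python
from typing import Iterable, Tuple

# One surface: (index before, index after, radius of curvature in mm).
# A mirror is modeled with index_after = -index_before (assumed convention).
Surface = Tuple[float, float, float]


def petzval_sum(surfaces: Iterable[Surface]) -> float:
    """Return the sum of (n' - n) / (n * n' * r) over all surfaces.

    A sum close to zero indicates a flat Petzval surface, i.e. no
    third-order field curvature in the absence of astigmatism.
    """
    return sum((n2 - n1) / (n1 * n2 * r) for n1, n2, r in surfaces)


# Hypothetical biconvex lens in air (n = 1.5, radii of +50 mm and -50 mm).
lens = [(1.0, 1.5, 50.0), (1.5, 1.0, -50.0)]

# Hypothetical concave mirror in air whose radius is chosen so that its
# negative Petzval contribution offsets the positive one from the lens.
mirror = [(1.0, -1.0, -150.0)]

lens_only = petzval_sum(lens)
lens_and_mirror = petzval_sum(lens + mirror)

print(f"lens only:     {lens_only:.6f} 1/mm")
print(f"lens + mirror: {lens_and_mirror:.6f} 1/mm")
```

The example shows why a catadioptric layout is attractive as an initial structure: the refracting element and the concave reflecting surface contribute Petzval terms of opposite sign, so their radii can be balanced against each other before aspheric terms are introduced.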