Preventing chronic disease is a long-term effort that requires continual fine-tuning as the disease progresses. Without comprehensive insight, prescriptions may prioritize short-term gains while deviating from the trajectory toward long-term survival. Here we introduce Duramax, a fully evidence-based framework for optimizing dynamic preventive strategies over the long term. The framework integrates reinforcement learning with real-world data modeling, leveraging the diverse treatment trajectories recorded in electronic health records (EHR). In our study, Duramax learned from millions of treatment decisions involving lipid-modifying drugs and became specialized in cardiovascular disease (CVD) prevention. The volume of implicit knowledge that Duramax drew on far exceeded that of any individual clinician, resulting in superior performance. Specifically, when clinicians' treatment decisions aligned with those suggested by Duramax, a reduction in CVD risk was observed. Moreover, post hoc analysis confirmed that Duramax's decisions were transparent and reasonable. Our research shows how tailored computational analysis of well-curated EHR can achieve highly nuanced, personalized disease prevention.