Background
Lyme disease is the most prevalent vector-borne disease in the United States, yet its host factors are poorly understood and diagnostic tests are limited. We evaluated patients in a large health system to uncover the role of cholesterol in the susceptibility, severity, and machine learning-based diagnosis of Lyme disease.
Methods
A longitudinal health system cohort comprised 1,019,175 individuals with electronic health record data, of whom 50,329 had linked genetic data. We investigated associations between Lyme disease and three exposures: blood cholesterol level, a cholesterol genetic score comprising common genetic variants, and the burden of rare loss-of-function (LoF) variants in cholesterol metabolism genes. A portable machine learning model was constructed and tested to predict Lyme disease using routine lipid and clinical measurements.
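As an illustration of the genetic score mentioned above, the following is a minimal sketch of one common way such a score is formed: a weighted sum of effect-allele dosages across variants. It is not the authors' implementation. The function name `genetic_score`, the weights, and the dosages are all hypothetical values chosen for the example.

```python
from typing import Sequence


def genetic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of effect-allele dosages.

    dosages: copies of the effect allele per variant (0, 1, or 2).
    weights: per-variant effect sizes (hypothetical values here).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical example: three variants with made-up weights and dosages.
example_weights = [0.12, 0.05, 0.30]
example_dosages = [2, 0, 1]
example_score = genetic_score(example_dosages, example_weights)
```

In this sketch, a higher score corresponds to carrying more cholesterol-raising alleles. In practice the weights would come from an external association study of cholesterol levels.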
Results
There were 3,832 cases of Lyme disease. Increasing cholesterol was associated with greater risk of Lyme disease, and hypercholesterolemia was more prevalent in Lyme disease cases than in controls. Cholesterol genetic scores and rare LoF variants in CD36 and LDLR were associated with elevated Lyme disease risk. Serological profiling of cases revealed parallel trajectories of rising cholesterol and immunoglobulin levels over the disease course, including marked increases in individuals with LoF variants and high cholesterol genetic scores. The machine learning model predicted Lyme disease using only routine lipid panel, blood count, and metabolic measurements.
Conclusions
These results demonstrate the value of large-scale genetic and clinical data for revealing host factors underlying infectious disease biology, risk, and prognosis, and the potential to translate these factors into machine learning diagnostics that require no specialized assays.