Background
Type 1 diabetes mellitus (T1DM) is characterized by chronic insulin deficiency and consequent hyperglycemia. Patients with T1DM require long-term exogenous insulin therapy to regulate blood glucose levels and prevent the long-term complications of the disease. Currently, there are no effective algorithms that consider the unique characteristics of T1DM patients to automatically recommend personalized insulin dosage levels.
Objective
The objective of this study was to develop and validate a general reinforcement learning (RL) framework for the personalized treatment of T1DM using clinical data.
Methods
This research presents a model-free data-driven RL algorithm, namely Q-learning, that recommends insulin doses to regulate the blood glucose level of a T1DM patient, considering his or her state defined by glycated hemoglobin (HbA1c) levels, body mass index, engagement in physical activity, and alcohol usage. In this approach, the RL agent identifies the different states of the patient by exploring the patient’s responses when he or she is subjected to varying insulin doses. On the basis of the result of a treatment action at time step t, the RL agent receives a numeric reward, positive or negative. The reward is calculated as a function of the difference between the actual blood glucose level achieved in response to the insulin dose and the targeted HbA1c level. The RL agent was trained on 10 years of clinical data of patients treated at the Mass General Hospital.
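The tabular Q-learning update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the bin edges in `make_state`, the three-action set, the negative-absolute-deviation reward, and the values of `alpha`, `gamma`, and `epsilon` are all hypothetical choices made for the example.

```python
from collections import defaultdict
import random

# Hypothetical dose adjustments; the study's actual action set may differ.
ACTIONS = ["decrease", "keep", "increase"]

def make_state(hba1c, bmi, active, alcohol):
    """Discretize patient features into a hashable state (bin edges are assumptions)."""
    hba1c_bin = 0 if hba1c < 7.0 else (1 if hba1c < 9.0 else 2)
    bmi_bin = 0 if bmi < 25.0 else (1 if bmi < 30.0 else 2)
    return (hba1c_bin, bmi_bin, bool(active), bool(alcohol))

def reward(achieved, target):
    """Assumed reward shape: larger deviation from the target gives a lower reward."""
    return -abs(achieved - target)

def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = r + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

# One illustrative transition with made-up patient values.
q = defaultdict(float)  # unseen state-action pairs default to 0.0
s = make_state(8.2, 27.0, True, False)
s_next = make_state(7.4, 27.0, True, False)
new_value = q_update(q, s, "increase", reward(7.4, 7.0), s_next)
```

In a data-driven setting such as the one described here, the transitions would come from recorded clinical visits rather than from live exploration, so `choose_action` would only be used when reading out a recommendation from the learned table.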
Results
A total of 87 patients were included in the training set. The mean age of these patients was 53 years, 59% (51/87) were male, 86% (75/87) were white, and 47% (41/87) were married. The performance of the RL agent was evaluated on 60 test cases. The insulin dosage interval recommended by the RL agent included the actual dose prescribed by the physician in 88% (53/60) of cases.
Conclusions
This exploratory study demonstrates that an RL algorithm can be used to recommend personalized insulin doses to achieve adequate glycemic control in patients with T1DM. However, further investigation in a larger sample of patients is needed to confirm these findings.
In this paper, a reinforcement learning algorithm is applied to regulating the blood glucose level of Type I diabetic patients using an insulin pump. In this approach, the agent learns from its exploration and experience to select its actions. In the current reinforcement learning algorithm, body weight, A1C level, and physical activity define the state of a diabetic patient. For the agent, insulin dose levels constitute the actions. There are five alternative actions for the agent: (1) raising the insulin infusion rate during 24 hours, (2) keeping it the same, (3) decreasing the insulin infusion rate, (4) adjusting the basal rate two times during 24 hours, and (5) adjusting the basal rate three times during 24 hours. As a result of a patient's treatment, after each time step t, the reinforcement learning agent receives a numerical reward that depends on the response of the patient's health condition. At each stage, the reward is calculated as a function of the deviation of the A1C from its target value. Because a reinforcement learning algorithm can select actions that improve the patient's condition while taking delayed effects into account, it has tremendous potential to control the blood glucose level in diabetic patients. This research will utilize ten years of clinical data obtained from a hospital.
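As a rough illustration of the five actions and of how delayed effects enter the objective, the sketch below encodes the action set and computes a discounted return over a short A1C trajectory. The enum member names, the target of 7.0, the negative-absolute-deviation reward, the discount factor, and the trajectory values are assumptions made for the example; the text states only that the reward is a function of the A1C deviation from its target.

```python
from enum import Enum

class Action(Enum):
    """The five alternatives listed above (member names are illustrative)."""
    RAISE_RATE = 1
    KEEP_RATE = 2
    LOWER_RATE = 3
    ADJUST_BASAL_TWICE = 4
    ADJUST_BASAL_THREE_TIMES = 5

def a1c_reward(a1c, target=7.0):
    """Assumed reward: the negative absolute deviation of A1C from its target."""
    return -abs(a1c - target)

def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**k * r_k, accumulated backwards from the last step."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Made-up trajectory in which A1C moves toward the assumed target.
a1c_trajectory = [9.0, 8.0, 7.0]
rewards = [a1c_reward(x) for x in a1c_trajectory]
total = discounted_return(rewards)
```

Because later rewards are included in the return (weighted by `gamma`), an action whose benefit appears only several steps later can still be preferred over one with a better immediate reward.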