Background: Diabetes mellitus (DM) is a major public health problem worldwide. It involves dysfunction of blood sugar regulation resulting from insulin resistance, inadequate insulin secretion, or excessive glucagon secretion.
Methods: This study collated 971,401 drug usage records from 51,009 DM patients. The data include patient identification code, age, gender, outpatient visit dates, visit code, medication features (including the items, doses, and frequencies of drugs), HbA1c results, and testing time. We applied a random forest (RF) model for feature selection and implemented a regression model based on the bidirectional long short-term memory (Bi-LSTM) deep learning architecture. Finally, we used the root mean square error (RMSE) as the evaluation index for the prediction model.
Results: After data cleaning, the data included 8,729 male and 9,115 female cases. Metformin was the most important feature suggested by the RF model, followed by glimepiride, acarbose, pioglitazone, glibenclamide, gliclazide, repaglinide, nateglinide, sitagliptin, and vildagliptin. The model performed better when trained on the past two seasons of data than when additional seasons were included. Further, the Bi-LSTM model performed better than support vector machines (SVMs).
Discussion & Conclusion: This study found that a Bi-LSTM model is a suitable kernel for a clinical decision support system (CDSS) that assists physicians' decision-making, and that increasing the number of seasons in the training data negatively affects performance. In addition, this study found that the most important drug is metformin, which is recommended as the first-line oral hypoglycemic agent (OHA) in various situations for DM patients.
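For readers unfamiliar with the evaluation index named in the Methods, the following is a minimal sketch of how RMSE can be computed between observed and predicted values. The function name `rmse` and the HbA1c numbers are hypothetical and invented for illustration; they are not taken from the study's data or code.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical HbA1c values (%), made up purely for illustration.
observed = [7.0, 8.0, 6.5, 9.0]
forecast = [7.5, 7.5, 6.5, 8.0]
score = rmse(observed, forecast)
print(f"RMSE: {score:.3f}")
```

A lower RMSE means the predictions lie closer to the observed values, which is the sense in which one model is said to perform better than another here.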