BACKGROUND
Infections in people with type 1 diabetes often make self-management problematic, i.e. they make it difficult to control blood glucose (BG) levels. During an infection, the body demands more energy to supply the tissues that are active in the immune response. Carbohydrate metabolism is therefore expected to change to keep up with this demand through enhanced glucose uptake and utilization, increased glucose production, and increased insulin resistance, among other mechanisms. Consequently, even with regular meals, any ingested carbohydrate may cause a significant increase in BG levels that often takes longer to settle than on a regular/normal day. Prolonged hyperglycemic episodes are common, and patients have to struggle with larger and more frequent insulin injections to bring down the abnormal BG levels. Such events (BG anomalies) present an enormous opportunity to detect infection incidence automatically from self-recorded data and, given a dedicated algorithm, to detect infectious disease outbreaks. Moreover, they can enable personalized decision support and a learning platform for individuals, families, and caregivers. During an infection, information on BG evolution, alterations in insulin sensitivity, and shifts in the insulin to carbohydrate ratio, i.e. changes in the amount of insulin needed for every gram of carbohydrate consumed, could be vital. Despite this potential, very few studies have focused on detecting infection incidence in individuals with type 1 diabetes using a dedicated personalized algorithm.
OBJECTIVE
The study aims to develop an algorithm, i.e. a personalized health model, that can automatically detect the incidence of infection in people with type 1 diabetes using self-recorded BG levels, dietary intake (carbohydrate in grams), and insulin information as indicator variables. The model is expected to detect deviations from the norm caused by infection, considering elevated BG levels (hyperglycemia) coupled with unusual changes in the insulin to carbohydrate ratio (frequent insulin injections and unusual reductions in carbohydrate intake).
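As an illustration of the indicator variables named above, the following minimal Python sketch derives a daily mean BG level and a daily insulin to carbohydrate ratio from self-recorded entries. The function name `daily_features`, the record layout, the use of total daily insulin in the numerator, and the example values are all assumptions made for illustration; they are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def daily_features(bg_readings, insulin_doses, carb_intakes):
    """Aggregate self-recorded entries into one feature pair per day.

    Each argument is a list of (day, value) tuples (hypothetical layout):
    BG readings, insulin doses in units, carbohydrate intakes in grams.
    Returns {day: (mean_bg, insulin_to_carb_ratio)}.
    """
    bg_by_day = defaultdict(list)
    insulin_by_day = defaultdict(float)
    carbs_by_day = defaultdict(float)

    for day, value in bg_readings:
        bg_by_day[day].append(value)
    for day, units in insulin_doses:
        insulin_by_day[day] += units
    for day, grams in carb_intakes:
        carbs_by_day[day] += grams

    features = {}
    for day, readings in bg_by_day.items():
        carbs = carbs_by_day.get(day, 0.0)
        if carbs <= 0:
            # The ratio is undefined without recorded carbohydrate; skip the day.
            continue
        # Assumption: ratio = total insulin units per gram of carbohydrate.
        ratio = insulin_by_day.get(day, 0.0) / carbs
        features[day] = (mean(readings), ratio)
    return features


# Invented example: one regular day and one day with elevated BG,
# more insulin, and less carbohydrate.
bg = [("day1", 6.0), ("day1", 8.0), ("day2", 12.0), ("day2", 14.0)]
insulin = [("day1", 4.0), ("day1", 6.0), ("day2", 8.0), ("day2", 8.0)]
carbs = [("day1", 50.0), ("day1", 50.0), ("day2", 40.0), ("day2", 40.0)]

features = daily_features(bg, insulin, carbs)
for day, (mean_bg, ratio) in sorted(features.items()):
    print(day, round(mean_bg, 2), round(ratio, 3))
```

In this invented example the second day shows both a higher mean BG and a higher ratio, which is the kind of joint deviation the model is expected to pick up.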
METHODS
Method: Semi-supervised models, i.e. one-class classifiers, were trained and tested to detect the incidence of infection in people with type 1 diabetes. Three groups of one-class classifiers (boundary and domain-based, density-based, and reconstruction-based methods) were trained on regular/normal day measurements (target datasets) and tested on a dataset containing both target (regular days) and non-target (infection days) samples. The boundary and domain-based methods include one-class support vector machine (v-SVM), minimum spanning tree (MST), support vector data description (SVDD), nearest neighbor (NN), and incremental SVM (incSVM). The density-based methods include Parzen, Naïve Parzen, normal Gaussian, mixture of Gaussians (MOG), minimum covariance Gaussian (MCG), k-nearest neighbor (KNN), and local outlier factor (LOF). The reconstruction-based methods include auto-encoder network, self-organizing map (SOM), K-means, and principal component analysis (PCA). For comparison purposes, two unsupervised models were also tested: local outlier factor (LOF) and connectivity-based outlier factor (COF). The one-class classifiers were evaluated using 20 repetitions of 5-fold stratified cross-validation. Area under the ROC curve (AUC), sensitivity, and F1-score were used to measure model performance. The models were compared on two groups of data: raw data and filtered data (with a 2-day moving average filter). Overall, the models were compared on detection performance, complexity, computational time, and number of samples required.
Materials: High-precision self-recorded data covering ten patient years, collected from 3 real subjects with type 1 diabetes (2 males and 1 female with average age of 34 (13.2) years), were used. The datasets consist of BG measurements and continuous glucose monitor (CGM) readings, injected insulin (basal and bolus), diet (carbohydrate in grams), and self-reported events of acute infection. It is costly and time consuming, if not impossible, to collect such a rich and large dataset from many participants. The patients used different diabetes self-management technologies to gather these datasets, including Diabetes Diary, Spike, Dexcom CGM, and insulin pens and pumps. The datasets consist of regular/normal years without infection incidences and years with at least one acute infection incidence. The regular/normal patient years are used as baseline data to compare the effect of all patient-controllable and patient-uncontrollable parameters during the incidence of infection. The self-reported incidences of acute infection include influenza (flu) and mild and light common colds without fever. All the experiments were conducted using MATLAB® 2018b (MathWorks, Inc, Natick, MA).
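The one-class setup described in the Method paragraph can be sketched as follows. This is a simplified illustration in Python rather than the MATLAB code used in the study: it fits a diagonal-covariance Gaussian (a stand-in for the "normal Gaussian" density method) to regular days only, scores unseen days, and reports AUC, sensitivity, and F1-score. The synthetic data, the `reject_fraction` threshold parameter, and the single hold-out split (instead of repeated 5-fold stratified cross-validation) are assumptions made to keep the example short.

```python
import random
from statistics import mean, stdev


def fit_gaussian(train):
    """Estimate per-feature mean and standard deviation from target (regular-day) samples."""
    columns = list(zip(*train))
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def anomaly_score(model, x):
    """Sum of squared z-scores; larger values lie further from the regular-day pattern."""
    mu, sigma = model
    return sum(((v - m) / s) ** 2 for v, m, s in zip(x, mu, sigma))


def auc(target_scores, outlier_scores):
    """Probability that a random infection day scores higher than a random regular day."""
    wins = 0.0
    for o in outlier_scores:
        for t in target_scores:
            wins += 1.0 if o > t else 0.5 if o == t else 0.0
    return wins / (len(target_scores) * len(outlier_scores))


def evaluate(train, test_regular, test_infection, reject_fraction=0.05):
    """Train on regular days only, then test on a mix of regular and infection days."""
    model = fit_gaussian(train)
    # Hypothetical threshold rule: flag roughly reject_fraction of the training days.
    train_scores = sorted(anomaly_score(model, x) for x in train)
    threshold = train_scores[int((1 - reject_fraction) * (len(train_scores) - 1))]

    regular_scores = [anomaly_score(model, x) for x in test_regular]
    infection_scores = [anomaly_score(model, x) for x in test_infection]

    tp = sum(s > threshold for s in infection_scores)  # infection days flagged
    fp = sum(s > threshold for s in regular_scores)    # regular days flagged
    fn = len(infection_scores) - tp                    # infection days missed
    return {
        "auc": auc(regular_scores, infection_scores),
        "sensitivity": tp / len(infection_scores),
        "f1": 2 * tp / (2 * tp + fp + fn) if tp else 0.0,
    }


def synthetic_days(rng, n, bg_mean, bg_sd, ratio_mean, ratio_sd):
    """Generate invented (mean BG, insulin to carbohydrate ratio) pairs."""
    return [(rng.gauss(bg_mean, bg_sd), rng.gauss(ratio_mean, ratio_sd)) for _ in range(n)]


rng = random.Random(0)
regular = synthetic_days(rng, 300, 7.0, 1.0, 0.10, 0.02)     # invented regular days
infection = synthetic_days(rng, 30, 11.0, 1.5, 0.20, 0.04)   # invented infection days

results = evaluate(regular[:200], regular[200:], infection)
print(results)
```

The key design point is that infection days never enter training: the model only learns what regular days look like, and infection days appear solely in the test set.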
RESULTS
The analysis of ten patient years of self-recorded data reveals that BG levels and the insulin to carbohydrate ratio are highly affected by the incidence of infection as compared to regular/normal days. Semi-supervised and unsupervised models trained and tested on bivariate input, BG levels and insulin to carbohydrate ratio, achieved excellent performance in describing the dataset, i.e. in separating infection days from regular/normal days. However, the unsupervised methods showed degraded performance compared with the one-class classifiers, mainly because of the atypical nature of the data, which are not uniformly distributed: some regions are dense while others are sparse. Among the one-class classifiers, the boundary and domain-based methods produced a better description of the data than the density-based and reconstruction-based methods, again mainly because of the atypicality of the data. Regarding computational time, NN, SVDD, and SOM took considerable training time, which typically grows as the sample size increases. As for testing time, only LOF and COF took considerable time.
CONCLUSIONS
We demonstrated the applicability of semi-supervised and unsupervised models for the detection of infection incidence in people with type 1 diabetes. Detecting the incidence of infection in this patient group can provide an opportunity to devise tailored services, i.e. personalized decision support and a learning platform for individuals, and can simultaneously be used to detect potential public health threats, i.e. infectious disease outbreaks, on a large scale through spatio-temporal cluster detection. In general, the proposed approaches achieved excellent performance, and the boundary and domain-based methods in particular performed better. Among the individual models, v-SVM, KNN, and K-means achieved better performance in all the infection cases. Altogether, we foresee that the presented results could encourage researchers to look beyond the presented features to additional features of the self-recorded data, e.g. various CGM features and physical activity data, on a large scale.