Background: Telemonitoring of symptoms and physiological signs has been suggested as a means of early detection of exacerbations of chronic obstructive pulmonary disease (COPD), with a view to instituting timely treatment. However, current algorithms to identify exacerbations result in frequent false positive results and increased workload. Machine learning, when applied to predictive modelling, can determine patterns of risk factors useful for improving the quality of predictions.
Objective: To establish whether machine learning techniques applied to telemonitoring datasets improve prediction of hospital admissions and of decisions to start steroids, and to determine whether the addition of weather data further improves such predictions.
Methods: We used daily symptoms, physiological measures and medication data, with baseline demography, COPD severity, quality of life, and hospital admissions from a pilot and a large randomised controlled trial of telemonitoring in COPD. In addition, we linked weather data from the UK Meteorological Office. We used feature selection and extraction techniques for time-series to construct up to 153 predictive patterns (features) from symptom, medication, and physiological measurements. The resulting variables were used to construct predictive models, which were fitted to training sets of patients and compared with common algorithms.
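As an illustration of the kind of time-series feature extraction described in the Methods, the following is a minimal sketch only, not the study's method. The function name `window_features`, the three-day window, the feature names, and the oxygen saturation values are all hypothetical and chosen for illustration; the actual 153 features are not specified here.

```python
from statistics import mean


def window_features(readings, window=3):
    """Summarise each day's trailing window of daily readings.

    `readings` is a list of daily values; None marks a missing day.
    Returns one dict per day. The features are illustrative assumptions.
    """
    features = []
    for day in range(len(readings)):
        start = max(0, day - window + 1)
        # Values observed inside the trailing window (missing days dropped).
        recent = [r for r in readings[start:day + 1] if r is not None]
        # Values observed before the window, used as a personal baseline.
        history = [r for r in readings[:start] if r is not None]
        recent_mean = mean(recent) if recent else None
        baseline = mean(history) if history else None
        if recent_mean is not None and baseline is not None:
            change = recent_mean - baseline
        else:
            change = None
        features.append({
            "recent_mean": recent_mean,
            "baseline": baseline,
            "change_from_baseline": change,
            "missing_in_window": (day + 1 - start) - len(recent),
        })
    return features


# Invented oxygen saturation readings with one missing day.
spo2 = [95, 96, None, 95, 94, 91, 90]
feats = window_features(spo2, window=3)
```

Keeping a count of missing days as a feature, rather than discarding incomplete days, is one way to accommodate the real-world missing-data scenario described in the Results.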
Results: We had a mean of 363 days of telemonitoring data from 135 patients. The two most practical traditional score-counting algorithms, restricted to cases with complete data, resulted in AUC estimates of 0.60 [95% CI 0.51, 0.69] and 0.58 [0.50, 0.67] for predicting admissions based on a single day's readings. However, in a real-world scenario allowing for missing data, with greater numbers of patient daily data and hospitalisations (N = 57,150, N+ = 17), the performance of all the traditional algorithms fell, including those based on two days' data. One of the most frequently used algorithms performed no better than chance. Machine learning models demonstrated significant improvements; the best machine learning algorithm, based on 57,150 episodes, resulted in an aggregated AUC = 0.73 [0.67, 0.79]. Addition of weather data measurements resulted in a negligible improvement in the predictive performance of the best model (AUC = 0.74 [0.69, 0.79]). In order to achieve an 80% true positive rate (sensitivity), the traditional algorithms were associated with an 80% false positive rate; our algorithm halved this rate to approximately 40% (specificity approximately 60%). The machine learning algorithm was moderately superior to the best standard algorithm (AUC = 0.77 [0.74, 0.79] vs AUC = 0.66 [0.63, 0.68]) at predicting the need for steroids.
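The Results compare algorithms by AUC and by the false positive rate incurred at a fixed 80% sensitivity. A minimal sketch of how both quantities can be computed from risk scores and outcomes is given below. The function names `auc` and `fpr_at_sensitivity` are hypothetical, and the scores and labels are invented toy values, not study data.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fpr_at_sensitivity(scores, labels, target=0.8):
    """Lowest false positive rate among thresholds reaching `target` sensitivity.

    A case is flagged when its score is at or above the threshold.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 1.0  # Flagging everyone always reaches full sensitivity.
    for threshold in sorted(set(scores)):
        sensitivity = sum(p >= threshold for p in pos) / len(pos)
        false_positive_rate = sum(n >= threshold for n in neg) / len(neg)
        if sensitivity >= target:
            best = min(best, false_positive_rate)
    return best


# Invented toy data: 1 = event (e.g. admission), 0 = no event.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_fpr = fpr_at_sensitivity(toy_scores, toy_labels, target=0.8)
```

Specificity is one minus the false positive rate, so a false positive rate of approximately 40% corresponds to a specificity of approximately 60%, as reported above.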
Conclusions: The early detection and management of COPD remains an important goal given the huge personal and economic costs of the condition. Machine learning approaches, which can be tailored to an individual's baseline profile and can learn from experience of the individual patient, are superior to existing predictive algorithms and show promis...