Background: Advanced machine learning methods combined with large sets of health screening data provide opportunities for diagnostic value in human and veterinary medicine.
Hypothesis/Objectives: To derive a model to predict the risk of cats developing chronic kidney disease (CKD) using data from electronic health records (EHRs) collected during routine veterinary practice.
Methods: Longitudinal EHRs from Banfield Pet Hospitals were extracted and randomly split into 2 parts. The first 67% of the data were used to build a prediction model, which included feature selection and identification of the optimal neural network type and architecture. The remaining unseen EHRs were used to evaluate the model performance.
Results: The final model was a recurrent neural network (RNN) with 4 features (creatinine, blood urea nitrogen, urine specific gravity, and age). When predicting CKD near the point of diagnosis, the model displayed a sensitivity of 90.7% and a specificity of 98.9%. Model sensitivity decreased when predicting the risk of CKD with a longer horizon, having 63.0% sensitivity 1 year before diagnosis and 44.2% 2 years before diagnosis, but with specificity remaining around 99%.
Conclusions and clinical importance: The use of models based on machine learning can support veterinary decision making by improving early identification of CKD.
KEYWORDS: artificial neural network, computer model, feline, machine learning, renal
Part 2: Learning - Ensemble Learning
An ensemble of distributed neural network classifiers is formed when several individual neural networks are each trained on their own local training data. Each classifier can provide either a single class label prediction or real-valued class outputs, normalized via softmax, that represent posterior probabilities and so give confidence levels. To form the ensemble decision, the individual classifier decisions can be combined by the well-known majority (or plurality) voting rule, which sums the votes for each class and selects the class that receives the most votes. Although majority voting is the most popular combination rule, many ties in votes can occur, especially in multi-class problems. Ties are usually broken either randomly, where the unknown instance is assigned at random to one of the tied classes, or by class proportions, where the tied class with the largest proportion wins. We present a tie-breaking strategy that uses softmax confidence accumulations. Every class accumulates a vote and a confidence for this vote. If a tie occurs, the tied class with the maximum confidence sum wins. The proposed tie breaking in the voting process performs very well in all cases of different data distributions on various benchmark datasets.
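The tie-breaking rule described in this abstract can be illustrated with a minimal Python sketch. It is one reading of the abstract, not the authors' implementation: it assumes each classifier votes for its arg-max class and adds only that class's softmax probability to the class's confidence sum. The function name `vote_with_confidence_tiebreak` and the example probabilities are hypothetical.

```python
def vote_with_confidence_tiebreak(prob_vectors):
    """Plurality vote over classifiers, breaking ties by accumulated confidence.

    prob_vectors: one softmax output (sequence of class probabilities)
    per classifier; all vectors must have the same length.
    Returns the index of the winning class.
    """
    if not prob_vectors:
        raise ValueError("need at least one classifier output")
    n_classes = len(prob_vectors[0])
    votes = [0] * n_classes          # number of votes per class
    confidence = [0.0] * n_classes   # summed softmax confidence of those votes

    for probs in prob_vectors:
        # Assumption: a classifier votes for its most probable class.
        winner = max(range(n_classes), key=lambda c: probs[c])
        votes[winner] += 1
        confidence[winner] += probs[winner]

    top = max(votes)
    tied = [c for c in range(n_classes) if votes[c] == top]
    # With a single leader this returns it; with a tie, the tied class
    # with the largest confidence sum wins.
    return max(tied, key=lambda c: confidence[c])


# Hypothetical outputs of four classifiers on a three-class problem.
# Classes 0 and 1 receive two votes each; class 1 has the larger confidence sum.
example_outputs = [
    [0.60, 0.30, 0.10],
    [0.55, 0.40, 0.05],
    [0.10, 0.80, 0.10],
    [0.20, 0.70, 0.10],
]
predicted_class = vote_with_confidence_tiebreak(example_outputs)
```

In this example the vote is tied 2 to 2, and class 1 is selected because its accumulated confidence (0.80 + 0.70) exceeds that of class 0 (0.60 + 0.55). If the confidence sums were also exactly equal, this sketch would return the lowest tied class index, a case the abstract does not address.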
The aim of this study was to derive a model to predict the risk of dogs developing chronic kidney disease (CKD) using data from electronic health records (EHRs) collected during routine veterinary practice. Data from 57,402 dogs were included in the study. Two-thirds of the EHRs were used to build the model, which included feature selection and identification of the optimal neural network type and architecture. The remaining unseen EHRs were used to evaluate model performance. The final model was a recurrent neural network with 6 features (creatinine, blood urea nitrogen, urine specific gravity, urine protein, weight, age). When identifying CKD at the time of diagnosis, the model displayed a sensitivity of 91.4% and a specificity of 97.2%. When predicting future risk of CKD, model sensitivity was 68.8% 1 year before diagnosis and 44.8% 2 years before diagnosis. Positive predictive value (PPV) varied between 15 and 23% and was influenced by the age of the patient, while the negative predictive value (NPV) remained above 99% under all tested conditions. While the modest PPV limits the model's use as a stand-alone diagnostic screening tool, its high specificity and NPV make it particularly effective at identifying patients that will not go on to develop CKD.
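Why a model with high sensitivity and specificity can still have a modest PPV follows from Bayes' rule when the condition is uncommon. The short Python sketch below shows that relationship. The sensitivity and specificity are the at-diagnosis figures quoted above, but the 1% prevalence is a purely hypothetical value chosen for illustration. The abstract does not report prevalence, and the function name `predictive_values` is likewise an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population.

    All arguments are fractions between 0 and 1. Prevalence must be
    strictly between 0 and 1 so that both denominators are non-zero.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity from the abstract (at time of diagnosis);
# the prevalence of 0.01 is hypothetical, not a figure from the study.
ppv, npv = predictive_values(sensitivity=0.914, specificity=0.972, prevalence=0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

Under this assumed prevalence, false positives from the large healthy population outnumber true positives, so PPV stays low while NPV stays close to 1. The exact values depend entirely on the prevalence chosen and should not be read as a reproduction of the study's results.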