Background: Vascular access surveillance of dialysis patients is a challenging task for clinicians. We derived and validated an arteriovenous fistula failure model (AVF-FM) based on machine learning. Methods: The AVF-FM is an XGBoost algorithm aimed at predicting AVF failure within three months among in-centre dialysis patients. The model was trained on the derivation set (70% of the initial cohort) using the information routinely collected in the Nephrocare European Clinical Database (EuCliD®). Model performance was assessed with the concordance statistic and calibration charts in the remaining 30% of records. Feature importance was computed using the SHAP method. Results: Overall, we included 13,369 patients. The Area Under the ROC Curve (AUC-ROC) of the AVF-FM was 0.80 (95% CI 0.79–0.81). Model calibration showed excellent representation of observed failure risk. The variables with the greatest impact on risk estimates were previous history of AVF complications, followed by access recirculation and other functional parameters, including metrics describing the temporal pattern of dialysis dose, blood flow, and dynamic venous and arterial pressures. Conclusions: The AVF-FM achieved good discrimination and calibration by combining routinely collected clinical and sensor data that require no additional effort by healthcare staff. It can therefore potentially enable risk-based personalization of AVF surveillance strategies.
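The validation steps named in the Methods (a 70%/30% split, the concordance statistic, and a calibration chart) can be illustrated with a minimal sketch. It does not reproduce the AVF-FM or its data: the function names `split_70_30`, `concordance`, and `calibration_table` are hypothetical, the choice of equal-count risk bins is an assumption, and the inputs are toy values rather than EuCliD® records.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle records and split them 70% / 30% (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def concordance(y_true, y_score):
    """Concordance statistic (AUC-ROC) for binary outcomes.

    Probability that a randomly chosen event has a higher predicted risk
    than a randomly chosen non-event; ties count as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_table(y_true, y_score, n_bins=10):
    """Rows of (mean predicted risk, observed event rate, count) per bin.

    Bins hold (nearly) equal numbers of records ordered by predicted risk;
    this binning scheme is an assumption, not taken from the abstract.
    """
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:
            mean_pred = sum(s for s, _ in chunk) / len(chunk)
            observed = sum(y for _, y in chunk) / len(chunk)
            rows.append((mean_pred, observed, len(chunk)))
    return rows


if __name__ == "__main__":
    # Toy outcomes and predicted risks, for illustration only.
    outcomes = [0, 0, 1, 1]
    risks = [0.1, 0.4, 0.35, 0.8]
    print(concordance(outcomes, risks))  # 0.75
    print(calibration_table(outcomes, risks, n_bins=2))
```

In a well-calibrated model, the mean predicted risk and the observed event rate in each row of the table lie close to each other; plotting one against the other gives a calibration chart.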
In this paper, we propose POTATOES (Partitioning Over-fiTting AuTOencoder EnSemble), a new type of autoencoder ensemble for unsupervised outlier detection. Autoencoders are a popular method for this type of problem, especially if the data is located near a submanifold of smaller dimension than that of the ambient space. The standard approach is to approximate the data with the decoder submanifold of the autoencoder and to use the reconstruction error as the anomaly score. However, one of the main problems is often finding the right amount of regularization. If the regularization is too strong, the data is underfitted and we obtain many false positives. If the regularization is too weak, the data is overfitted, which results in false negatives. The remedy we propose is not to regularize at all, but rather to randomly partition the data into sufficiently many equally sized parts, overfit each part with its own autoencoder, and use the maximum over all autoencoder reconstruction errors as the anomaly score. We apply our model to realistic data and show that it often outperforms current outlier detection methods.
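The partition-and-maximum scoring scheme described above can be sketched as follows. This is not the authors' implementation: to keep the example short and deterministic, a nearest-neighbour `Memorizer` (a hypothetical class) stands in for each unregularized, overfitted autoencoder. It "reconstructs" a point as the closest point it was trained on, so its own training points have zero error, which mimics the overfitting limit. The names `partition` and `anomaly_scores` and the toy data are likewise assumptions.

```python
import math
import random


def partition(data, n_parts, seed=0):
    """Randomly split data into n_parts parts whose sizes differ by at most one."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    return [[data[i] for i in idx[k::n_parts]] for k in range(n_parts)]


class Memorizer:
    """Stand-in for an overfitted autoencoder (assumption, not an autoencoder).

    It memorizes its part of the data; the reconstruction error of a point
    is the Euclidean distance to the nearest memorized point.
    """

    def __init__(self, part):
        self.part = part

    def reconstruction_error(self, x):
        return min(math.dist(x, p) for p in self.part)


def anomaly_scores(data, n_parts=4, seed=0):
    """Score each point by the maximum reconstruction error over all models."""
    models = [Memorizer(part) for part in partition(data, n_parts, seed)]
    return [max(m.reconstruction_error(x) for m in models) for x in data]


if __name__ == "__main__":
    # Toy data: 200 points on a line segment (the "submanifold") plus one outlier.
    data = [(i / 100, 0.0) for i in range(200)] + [(1.0, 5.0)]
    scores = anomaly_scores(data)
    print(scores.index(max(scores)))  # index of the highest-scoring point
```

The outlier is memorized only by the model trained on its own part, so that model assigns it zero error, but every other model assigns it a large error and the maximum exposes it. Points near the line are reconstructed well by every model, provided each part samples the line densely enough, which is why the abstract asks for parts that are numerous yet still sufficiently large.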