Estimating classification probabilities in high-dimensional diagnostic studies

Appel, Inka J.; Gronwald, Wolfram; Spang, Rainer

doi:10.1093/bioinformatics/btr434

Cited by 4 publications

(1 citation statement)

References 17 publications

“…The predicted RF scores resulting from predicting the one-third test data of the outer CV loop are then recalibrated by applying the calibration model that was fitted on the RF scores generated during the nested CV. A similar CV scheme was used by Appel et al 46 to validate estimated classification probabilities.…”

Section: Methods (Online Only), mentioning

confidence: 99%

DNA methylation-based classification of central nervous system tumours

Capper, Jones, Sill, et al. 2018. Nature

2,165

2,207


Summary: Accurate pathological diagnosis is crucial for optimal management of cancer patients. For the ~100 known central nervous system (CNS) tumour entities, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. We herein present the development of a comprehensive approach for DNA methylation-based CNS tumour classification across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that availability of this method may have a substantial impact on diagnostic precision compared with standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility we have designed a free online classifier tool (www.molecularneuropathology.org) requiring no additional onsite data processing. Our results provide a blueprint for the generation of machine learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.
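The recalibration scheme quoted in the citation statement above (fit a calibration model on scores produced inside cross-validation, then apply it to scores of held-out test data) can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline of either paper: a one-dimensional nearest-centroid score stands in for random forest scores, Platt-style logistic scaling stands in for the calibration model, the data are synthetic, and all function names are hypothetical.

```python
import math
import random

def centroid_score(train, x):
    """Raw, uncalibrated score (stand-in for an RF score); positive favours class 1."""
    m0 = sum(f for f, y in train if y == 0) / sum(1 for _, y in train if y == 0)
    m1 = sum(f for f, y in train if y == 1) / sum(1 for _, y in train if y == 1)
    return abs(x - m0) - abs(x - m1)

def out_of_fold_scores(data, k=5):
    """Inner CV loop: score every sample with a model that never saw it."""
    scores = [0.0] * len(data)
    for fold in range(k):
        fit_part = [d for i, d in enumerate(data) if i % k != fold]
        for i, (x, _) in enumerate(data):
            if i % k == fold:
                scores[i] = centroid_score(fit_part, x)
    return scores

def fit_platt(scores, labels, steps=2000, lr=0.1):
    """Fit p = sigmoid(a * s + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def calibrate(a, b, s):
    """Map a raw score to a calibrated class-1 probability."""
    return 1.0 / (1.0 + math.exp(-(a * s + b)))

# Synthetic two-class data (assumption): class means 0 and 2, unit variance.
rng = random.Random(0)
data = [(rng.gauss(2.0 * y, 1.0), y) for y in (0, 1) for _ in range(150)]
rng.shuffle(data)

# Outer split: two-thirds for fitting, one-third held out as test data.
cut = len(data) * 2 // 3
train, test = data[:cut], data[cut:]

# Calibration model is fitted only on out-of-fold scores from the inner loop.
inner_scores = out_of_fold_scores(train)
a, b = fit_platt(inner_scores, [y for _, y in train])

# Held-out test scores are then recalibrated with that model.
test_probs = [calibrate(a, b, centroid_score(train, x)) for x, _ in test]
```

Fitting the calibration model on out-of-fold scores, rather than on scores of samples the classifier was trained on, is what keeps the calibrated probabilities from being overly optimistic.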


Classification of samples from NMR-based metabolomics using principal components analysis and partial least squares with uncertainty estimation

2018


Recent progress in metabolomics has been aided by the development of analysis techniques such as gas and liquid chromatography coupled with mass spectrometry (GC-MS and LC-MS) and nuclear magnetic resonance (NMR) spectroscopy. The vast quantities of data produced by these techniques have resulted in an increase in the use of machine algorithms that can aid in the interpretation of these data, such as principal components analysis (PCA) and partial least squares (PLS). Techniques such as these can be applied to biomarker discovery, interlaboratory comparison, and clinical diagnoses. However, there is a lingering question whether the results of these studies can be applied to broader sets of clinical data, usually taken from different data sources. In this work, we address this question by creating a metabolomics workflow that combines a previously published consensus analysis procedure ( https://doi.org/10.1016/j.chemolab.2016.12.010 ) with PCA and PLS models using uncertainty analysis based on bootstrapping. This workflow is applied to NMR data that come from an interlaboratory comparison study using synthetic and biologically obtained metabolite mixtures. The consensus analysis identifies trusted laboratories, whose data are used to create classification models that are more reliable than without. With uncertainty analysis, the reliability of the classification can be rigorously quantified, both for data from the original set and for new data that the model is analyzing.
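The idea of quantifying classification reliability by bootstrapping can be illustrated with a small sketch. It is not the workflow from the abstract: a one-dimensional nearest-centroid rule stands in for the PCA and PLS models, the data are synthetic, and the names `nearest_centroid` and `bootstrap_votes` are hypothetical.

```python
import random

def nearest_centroid(train, x):
    """Assign x to the class whose mean feature value is closest."""
    groups = {}
    for feature, label in train:
        groups.setdefault(label, []).append(feature)
    return min(groups, key=lambda lab: abs(x - sum(groups[lab]) / len(groups[lab])))

def bootstrap_votes(train, x, n_boot=500, seed=0):
    """Refit on resampled training sets and report the share of votes per class."""
    rng = random.Random(seed)
    votes = {}
    for _ in range(n_boot):
        # Resample the training set with replacement, same size as the original.
        resample = [rng.choice(train) for _ in train]
        label = nearest_centroid(resample, x)
        votes[label] = votes.get(label, 0) + 1
    return {label: count / n_boot for label, count in votes.items()}

# Synthetic training data (assumption): class "A" near 0, class "B" near 3.
rng = random.Random(1)
train = [(rng.gauss(0.0, 1.0), "A") for _ in range(30)]
train += [(rng.gauss(3.0, 1.0), "B") for _ in range(30)]

clear_case = bootstrap_votes(train, -1.0)  # far from the class boundary
borderline = bootstrap_votes(train, 1.5)   # close to the class boundary
```

A sample whose vote shares are split across classes is one for which the classification depends strongly on which training samples happened to be drawn, so its assignment deserves less trust.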
