This prognostic study evaluates whether psychosis transition can be predicted in patients with clinical high-risk syndromes or recent-onset depression by multimodal machine learning that optimally integrates clinical and neurocognitive data, structural magnetic resonance imaging, and polygenic risk scores for schizophrenia.
Identifying distinctive subtypes of schizophrenia could ultimately enhance diagnostic and prognostic accuracy. We aimed to uncover neuroanatomical subtypes of chronic schizophrenia patients to test whether stratification can enhance computer-aided discrimination of patients from control subjects. Unsupervised, data-driven clustering of structural MRI (sMRI) data was used to identify 2 subtypes of schizophrenia patients drawn from a US-based open science repository (n = 71), and we quantified classification improvements compared to controls (n = 74) using supervised machine learning. We externally validated the unsupervised and supervised learning models in a heterogeneous German validation sample (n = 316) and characterized symptom, cognition, and longitudinal symptom change signatures. Stratification improved classification accuracies from 68.5% to 73% (subgroup 1) and 78.8% (subgroup 2), respectively. Increased accuracy was also found when models were externally validated, and an average gain of 9% was found in supplementary analyses. The first subgroup was associated with cortical and subcortical volume reductions coupled with substantially longer illness duration, whereas the second subgroup was mainly characterized by cortical reductions, shorter illness duration, and comparatively fewer negative symptoms. Individuals within each subgroup could be identified using just 10 clinical questions at an accuracy of 81.2%, and differential cognitive and symptom course signatures were suggested in multivariate analyses. Our findings suggest that sMRI-based subtyping enhances the neuroanatomical discrimination of schizophrenia by identifying generalizable brain patterns that align with a clinical staging model of the disorder. These findings could be used to improve illness stratification for biomarker-based computer-aided diagnoses.
BACKGROUND: The clinical high risk (CHR) paradigm has facilitated research into the underpinnings of help-seeking individuals at risk for developing psychosis, aiming at predicting and possibly preventing transition to the overt disorder. Statistical methods such as machine learning and Cox regression have provided the methodological basis for this research by enabling the construction of diagnostic models (i.e., distinguishing CHR individuals from healthy individuals) and prognostic models (i.e., predicting a future outcome) based on different data modalities, including clinical, neurocognitive, and neurobiological data. However, their translation to clinical practice is still hindered by the high heterogeneity of both CHR populations and methodologies applied. METHODS: We systematically reviewed the literature on diagnostic and prognostic models built on Cox regression and machine learning. Furthermore, we conducted a meta-analysis on prediction performances investigating heterogeneity of methodological approaches and data modality. RESULTS: A total of 44 articles were included, covering 3707 individuals for prognostic studies and 1052 individuals for diagnostic studies (572 CHR patients and 480 healthy control subjects). CHR patients could be classified against healthy control subjects with 78% sensitivity and 77% specificity. Across prognostic models, sensitivity reached 67% and specificity reached 78%. Machine learning models outperformed those applying Cox regression by 10% sensitivity. There was a publication bias for prognostic studies yet no other moderator effects. CONCLUSIONS: Our results may be driven by substantial clinical and methodological heterogeneity currently affecting several aspects of the CHR field and limiting the clinical implementability of the proposed models. We discuss conceptual and methodological harmonization strategies to facilitate more reliable and generalizable models for future clinical practice.
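For readers less familiar with the performance measures pooled in the meta-analysis above, the following is a minimal sketch of how sensitivity, specificity, and balanced accuracy are derived from a confusion matrix. The function name `diagnostic_metrics` and the counts are hypothetical: the counts were chosen only so that the output matches the reported diagnostic figures (78% sensitivity, 77% specificity) and are not data from the reviewed studies. Balanced accuracy is not reported in the abstract and is shown only as a common summary of the two measures.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced accuracy) from confusion counts.

    tp: cases correctly classified as positive (e.g. CHR)
    fn: cases wrongly classified as negative
    tn: controls correctly classified as negative
    fp: controls wrongly classified as positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Illustrative counts only (assumption): 100 cases and 100 controls.
sens, spec, bac = diagnostic_metrics(tp=78, fn=22, tn=77, fp=23)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} balanced accuracy={bac:.3f}")
```

Because sensitivity and specificity are each computed within one class, neither depends on the ratio of cases to controls in a sample, which makes them convenient quantities to pool across studies with different group sizes.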