Alzheimer's disease (AD) is associated with heterogeneous atrophy patterns, which are increasingly manifested throughout the disease course, driven by underlying neuropathologic processes. Herein, we show that manifestations of these brain changes in early asymptomatic stages can be detected via a novel deep semi-supervised representation learning method. We first identified two dominant dimensions of brain atrophy in symptomatic mild cognitive impairment (MCI) and AD patients: the diffuse-AD (R1) dimension shows widespread brain atrophy, and the MTL-AD (R2) dimension displays focal medial temporal lobe (MTL) atrophy. Critically, only R2 was associated with known genetic risk factors (e.g., APOE4) of AD in MCI and AD patients at baseline. We then showed that brain changes along these two dimensions were independently detected in early stages in a cohort representative of the general population and two cognitively unimpaired cohorts of asymptomatic participants. In the general population, genome-wide association studies found 77 genes unrelated to APOE differentially associated with R1 and R2. Functional analyses revealed that these genes were overrepresented in differentially expressed gene sets in organs beyond the brain (R1 and R2), including the heart (R1) and the pituitary gland, muscle, and kidney (R2). These genes were also enriched in biological pathways implicated in dendritic cells (R2), macrophage functions (R1), and cancer (R1 and R2). The longitudinal progression of R1 and R2 in the cognitively unimpaired populations, as well as in individuals with MCI and AD, showed variable associations with established AD risk factors, including APOE4, tau, and amyloid. Our findings deepen our understanding of the multifaceted pathogenesis of AD beyond the brain. 
In early asymptomatic stages, the two dimensions are associated with diverse pathological mechanisms, including cardiovascular diseases, inflammation, and hormonal dysfunction, driven by genes different from APOE, which collectively contribute to the early pathogenesis of AD.
Normal and pathologic neurobiological processes influence brain morphology in coordinated ways that give rise to patterns of structural covariance (PSC) across brain regions and individuals during brain aging and brain diseases. The genetic underpinnings of these patterns remain largely unknown. We apply a stochastic multivariate factorization method to a diverse population of 50,699 individuals (12 studies, 130 sites) and derive data-driven, multi-scale PSCs of regional brain size. PSCs were significantly correlated with 915 genomic loci in the discovery set, 617 of which are novel, and 72% were independently replicated. Key pathways influencing PSCs involved reelin signaling, apoptosis, neurogenesis, and appendage development, while pathways of breast cancer indicate potential interplays between brain metastasis and PSCs associated with neurodegeneration and dementia. Using machine learning, multi-scale PSCs effectively derive imaging signatures of several brain diseases. Our results elucidate new genetic and biological underpinnings that influence structural covariance patterns in the human brain.
Objective: While machine learning (ML) includes a valuable array of tools for analyzing biomedical data with multivariate and complex underlying associations, significant time and expertise are required to assemble effective, rigorous, comparable, reproducible, and unbiased pipelines. Automated ML (AutoML) tools seek to facilitate ML application by automating a subset of analysis pipeline elements. In this study we develop and validate a Simple, Transparent, End-to-end Automated Machine Learning Pipeline (STREAMLINE) and apply it to investigate the added utility of photography-based phenotypes for predicting obstructive sleep apnea (OSA), a common and underdiagnosed condition associated with a variety of health, economic, and safety consequences. Methods: STREAMLINE is designed to tackle biomedical binary classification tasks while (1) avoiding common mistakes, (2) accommodating complex associations and common data challenges, and (3) allowing scalability, reproducibility, and model interpretation. It automates the majority of established, generalizable, and reliably automatable aspects of an ML analysis pipeline while incorporating cutting-edge algorithms and providing opportunities for human-in-the-loop customization. We present a broadly refactored and extended release of STREAMLINE, validating and benchmarking performance across simulated and real-world datasets. We then applied STREAMLINE to evaluate the utility of demographics (DEM), self-reported comorbidities (DX), symptoms (SYM), and photography-based craniofacial (CF) and intraoral (IO) anatomy measures in predicting 'any OSA' or 'moderate/severe OSA' using 3,111 participants from the Sleep Apnea Global Interdisciplinary Consortium (SAGIC). Results: Benchmarking analyses validated the efficacy of STREAMLINE across data simulations with increasingly complex patterns of association including epistatic interactions and genetic heterogeneity.
OSA analyses identified a significant increase in ROC-AUC when adding CF to DEM+DX+SYM to predict 'moderate/severe' OSA. Additionally, a consistent but non-significant increase in PRC-AUC was observed with the addition of each subsequent feature set to predict 'any OSA', with CF and IO yielding minimal improvements. Conclusion: STREAMLINE is an effective, rigorous, transparent, and easy-to-use AutoML approach to a comparative ML analysis that adheres to best practices in data science. Application of STREAMLINE to OSA data suggests that CF features provide additional value in predicting moderate/severe OSA, but neither CF nor IO features meaningfully improved the prediction of 'any OSA' beyond established demographics, comorbidity and symptom characteristics.
Keywords: automated machine learning • obstructive sleep apnea • data science • predictive modeling • craniofacial traits • intraoral anatomy
user-specification of feature types (which cannot always be reliably automated) and one-hot-encoding of categorical features for modeling, (3) engineering of 'missingness features' to consider missingness as a potentially informati...
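The two preprocessing steps named in the fragment above, one-hot encoding of categorical features and engineered 'missingness features', can be illustrated with a minimal pure-Python sketch. This is not STREAMLINE's implementation, which the text here does not show; the function names, the `_missing` suffix, and the toy participant records are all hypothetical and chosen only for illustration.

```python
def add_missingness_features(rows, features):
    """Add a binary '<feature>_missing' column for each listed feature.

    A value of None is treated as missing (an assumption of this sketch).
    """
    out = []
    for row in rows:
        new_row = dict(row)
        for name in features:
            new_row[f"{name}_missing"] = 1 if row.get(name) is None else 0
        out.append(new_row)
    return out


def one_hot_encode(rows, feature):
    """Replace one categorical feature with a 0/1 column per observed category."""
    categories = sorted(
        {row[feature] for row in rows if row.get(feature) is not None}
    )
    out = []
    for row in rows:
        # Copy every column except the categorical one being expanded.
        new_row = {k: v for k, v in row.items() if k != feature}
        for category in categories:
            new_row[f"{feature}_{category}"] = 1 if row.get(feature) == category else 0
        out.append(new_row)
    return out


# Hypothetical toy records; None marks a missing value.
participants = [
    {"age": 54, "sex": "F", "neck_cm": 36.0},
    {"age": None, "sex": "M", "neck_cm": 41.5},
    {"age": 61, "sex": None, "neck_cm": None},
]

with_flags = add_missingness_features(participants, ["age", "sex", "neck_cm"])
encoded = one_hot_encode(with_flags, "sex")
```

In this sketch a record with a missing category gets zeros in every one-hot column, so the separate `sex_missing` flag is what preserves the information that the value was absent. In a real analysis the category list and any imputation would normally be derived from the training partition only, to avoid leaking information from the test data.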