2022
DOI: 10.1111/rssc.12536

Outcome-Guided Sparse K-Means for Disease Subtype Discovery via Integrating Phenotypic Data with High-Dimensional Transcriptomic Data

Abstract: The discovery of disease subtypes is an essential step for developing precision medicine, and disease subtyping via omics data has become a popular approach. While promising, subtypes obtained from existing approaches are not necessarily associated with clinical outcomes. With the rich clinical data along with the omics data in modern epidemiology cohorts, it is urgent to develop an outcome-guided clustering algorithm to fully integrate the phenotypic data with the high-dimensional omics data. Hence, we extende…

Cited by 5 publications (4 citation statements)
References 54 publications (64 reference statements)

“…The Cancer Genome Atlas Networks (TCGA) group further demonstrated these four subtypes of breast cancer via multi-omics datasets, including genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, and microRNA sequencing ( Cancer Genome Atlas Network 2012 ). Using omics datasets to identify disease subtypes has been extensively studied in many complex diseases beyond breast cancer, such as leukemia ( Golub et al 1999 , Bullinger et al 2004 ), lymphoma ( Alizadeh et al 2000 , Rosenwald et al 2002 ), colorectal cancer ( Mo et al 2013 , Sadanandam et al 2013 ), and Alzheimer’s disease ( Bredesen 2015 , Meng et al 2022 ).…”
Section: Introduction (mentioning)
confidence: 99%
“…Nevertheless, other noisy patterns within omics datasets may compromise the power to identify clinically meaningful subtypes. For example, in genetics studies, clusters could be driven by sex-related genes, age-related genes, or other unknown confounders-related genes instead of disease-related genes that interest us ( Meng et al 2022 ). Here, noisy clusters refer to those driven by nondisease-related biomarkers, as illustrated in Supplementary Fig.…”
Section: Introduction (mentioning)
confidence: 99%
“…The study will focus on autoantibody-positive relatives of people with type 1 diabetes. There is a lack of literature on defining clusters among individuals at risk of type 1 diabetes, and previous studies on clustering analysis of diabetes subtypes are limited to unsupervised clustering methods which may be less efficient in capturing the key variables that inform disease risks 11. The proposed method is a nonparametric approach (meaning that it does [not] rely on assumptions about the data distribution or linear relationships) for identifying clusters of individuals informed by several key risk factors ascertained from the data.…”
Section: Introduction (mentioning)
confidence: 99%