2017
DOI: 10.1214/17-aoas1033

Integrative sparse $K$-means with overlapping group lasso in genomic applications for disease subtype discovery

Abstract: Cancer subtypes discovery is the first step to deliver personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering a biological mechanism behind the complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological…

Cited by 26 publications (22 citation statements)
References 53 publications
“…In our study, we adopt the GAP approach that has been the most common choice in single dataset‐based clustering analysis. It has been also suggested in recent clustering studies with multidimensional data and shown to behave well in numerical analyses. Thus, it may be reasonable to conjecture that the GAP statistic is also valid for multidimensional data.…”
Section: Discussion (mentioning)
confidence: 73%
“…It has been also suggested in recent clustering studies with multidimensional data and shown to behave well in numerical analyses [44,45]. Thus, it may be reasonable to conjecture that the GAP statistic is also valid for multidimensional data. Detailed studies are deferred to future investigation.…”
Section: Discussion (mentioning)
confidence: 99%
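
For readers unfamiliar with the GAP statistic mentioned in the statement above, the standard definition (Tibshirani, Walther and Hastie, 2001) can be sketched as follows. The symbols $W_k$, $D_r$, $n_r$, $B$ and $s_k$ are notation assumed here for illustration; they do not appear in the quoted passage, and the citing paper may use a variant.

```latex
% Pooled within-cluster dispersion for a partition into clusters C_1, ..., C_k,
% where n_r = |C_r| and d_{ii'} is the (squared Euclidean) distance between samples i and i'.
D_r = \sum_{i, i' \in C_r} d_{ii'}, \qquad
W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} D_r

% Gap statistic: expected log-dispersion under a reference (null) distribution
% minus the observed log-dispersion.
\mathrm{Gap}_n(k) = E_n^{*}\!\left[ \log W_k \right] - \log W_k

% Selection rule, with the expectation estimated from B reference datasets and
% sd_k the standard deviation of \log W_k^{*} over those datasets.
\hat{k} = \min \left\{ k : \mathrm{Gap}_n(k) \ge \mathrm{Gap}_n(k+1) - s_{k+1} \right\},
\qquad s_k = \mathrm{sd}_k \sqrt{1 + 1/B}
```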
“…IS-K means: Huo and Tseng [97] have developed an integrative sparse K-means approach with overlapping group LASSO to perform omics feature selection and cancer subtype identification. The formulation of feature groups is flexible: groups can come from multi-level omics data (such as mRNA, CNV and methylation) sharing the same cis-regulatory information, or from the pathway-guided clustering scenario.…”
Section: Multi-omics Data Integration (mentioning)
confidence: 99%
“…It is rooted in the dual ascent and augmented Lagrangian methods from convex optimization, yet combines the strengths of both. ADMM can handle multiple constraints in optimization, which is of great importance in integrating multi-level omics data, as the complex data structure and the way of conducting integration can be modelled by imposing constraints on the objective function [68,97]. In addition, the EM algorithm plays an important role in traditional clustering analysis, such as K-means clustering, since whether a sample belongs to a certain cluster can be treated as a missing data problem.…”
Section: Multi-omics Data Integration (mentioning)
confidence: 99%
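
The "missing data" view of K-means in the statement above can be illustrated with a minimal sketch, assuming plain (non-sparse, single-dataset) K-means with squared Euclidean distance. It is not the IS-K means or ADMM procedure of the cited paper; the function names `sq_dist` and `kmeans`, the random initialisation and the toy data are all hypothetical choices made for illustration. The unknown cluster label of each sample plays the role of the missing data: one step fills the labels in given the centers, the other re-estimates the centers given the labels.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain K-means: alternate hard label assignment and center update."""
    rng = random.Random(seed)
    # Assumed initialisation: k samples drawn without replacement.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # "E-like" step: fill in each sample's missing label with its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # "M-like" step: re-estimate each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Toy data: two well-separated groups of three samples each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

Unlike a full EM algorithm for a mixture model, the assignment step here is hard (each sample gets exactly one label) rather than a posterior probability, which is why the steps are labelled "E-like" and "M-like" only.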
“…In the K-cluster method, a cluster is included in the collection of data with specific similarities. The K-cluster method may not be appropriate for clusters of small size and density; however, it scales well to large data sets and is considered the fastest clustering technique [58]. Overall, unsupervised learning is mainly used to find previously unknown patterns and clusters in a data set.…”
Section: Machine Learning In Nutrition Studies (mentioning)
confidence: 99%