Spectral Clustering Strategies for Heterogeneous Disease Expression Data

Huang, Grace; Cunningham, Kathryn I.; Benos, Panayiotis; Chennubhotla, Chakra

doi:10.1142/9789814447973_0021

Cited by 8 publications

(7 citation statements)

References 28 publications


“…Therefore, primary feature selection here is equivalent to discovering the PC(T). For generating the tree structure we use ReKS (Recursive K-means Spectral Clustering), which was shown to outperform other methods in terms of speed or efficiency and outputs more balanced trees when applied to heterogeneous clinical data (10). Finally, to create the representative features of a cluster we tested the first Principal Component of the cluster, the medoid, and the centroid of the clustered variables.…”

Section: Methods (mentioning)

confidence: 99%
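
The three candidate cluster representatives named in the statement above (first principal component, medoid, centroid) can be illustrated with a minimal sketch. This is not the T-ReCS implementation: the function name `cluster_representatives`, the samples-by-features layout of `X`, and the use of Euclidean distance between variables for the medoid are all assumptions made for illustration.

```python
import numpy as np

def cluster_representatives(X, idx):
    """Return (pc1, medoid, centroid, medoid_col) for one cluster of variables.

    X   : array of shape (n_samples, n_features)  (assumed layout)
    idx : list of column indices belonging to the cluster (hypothetical input)
    """
    sub = X[:, idx]                              # (n_samples, m) cluster submatrix
    centered = sub - sub.mean(axis=0)            # center each variable

    # First principal component: project samples onto the top right-singular vector.
    # The sign of the component is arbitrary.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = centered @ vt[0]

    # Centroid: per-sample mean over the clustered variables.
    centroid = sub.mean(axis=1)

    # Medoid: the actual variable with the smallest total Euclidean distance
    # to the other variables in the cluster (distance choice is an assumption).
    diffs = sub[:, :, None] - sub[:, None, :]    # (n_samples, m, m)
    dist = np.sqrt((diffs ** 2).sum(axis=0))     # (m, m) pairwise distances
    medoid_col = idx[int(dist.sum(axis=1).argmin())]

    return pc1, X[:, medoid_col], centroid, medoid_col


# Toy data: columns 0-2 form one cluster, column 3 is unrelated.
X = np.array([[0.0, 1.0, 2.0, 9.0],
              [1.0, 2.0, 3.0, 9.0],
              [2.0, 3.0, 4.0, 9.0]])
pc1, medoid, centroid, medoid_col = cluster_representatives(X, [0, 1, 2])
```

The medoid is always one of the original variables, whereas the first principal component and the centroid are derived features, which may matter when the selected feature has to be interpretable.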


T-Recs: Stable Selection of Dynamically Formed Groups of Features With Application to Prediction of Clinical Outcomes

Huang, Tsamardinos, Raghu, et al. 2014

Biocomputing 2015

Self Cite


Feature selection is used extensively in biomedical research for biomarker identification and patient classification, both of which are essential steps in developing personalized medicine strategies. However, the structured nature of biological datasets and the high correlation of variables frequently yield multiple equally optimal signatures, making traditional feature selection methods unstable. Features selected based on one cohort of patients may not work as well in another cohort. In addition, biologically important features may be missed due to selection of other co-clustered features. We propose a new method, Tree-guided Recursive Cluster Selection (T-ReCS), for efficient selection of grouped features. T-ReCS significantly improves predictive stability while maintaining the same level of accuracy. T-ReCS does not require a priori knowledge of the clusters, as group-lasso does, and can also handle "orphan" features (not belonging to a cluster). T-ReCS can be used with categorical or survival target variables. Tested on simulated and real expression data from breast cancer and lung diseases and on survival data, T-ReCS selected stable cluster features without significant loss in classification accuracy.



“…The complexity of T-ReCS is roughly O(|φ|²). Specifically, ReKS is O(|φ|²) (10), MMPC is O(|φ|·|PC(T)|·k), and conditional independence tests for ascending the tree is O(log|φ|·|PC(T)|). We note, however, that selection of different methods for single feature selection and tree construction can alter this complexity.…”

Section: Methods (mentioning)

confidence: 99%

T-Recs: Stable Selection of Dynamically Formed Groups of Features With Application to Prediction of Clinical Outcomes

Huang, Tsamardinos, Raghu, et al. 2014

Biocomputing 2015

Self Cite


“…Another effective clustering method was proposed for heterogeneous disease expression data [17]. Recursive Kmeans spectral clustering method (ReKS) was developed, which was found to be superior to the hierarchical clustering method and much faster than k-means.…”

Section: Related Work and Theoretical Background (mentioning)

confidence: 99%

Image Clustering Method based on Particle Swarm Optimization

Kim, Matveeva, Viksnin, et al. 2018

Annals of Computer Science and Information Systems


To implement efficient computer vision mechanisms, efficient image clustering methods are important. The paper elaborates a clustering method based on particle swarm optimization (PSO) which provides automatic establishment of clustering parameters. The developed PSO-based clustering method was tested on 860 images for a car vision system, and its results and contribution to improving pattern recognition quality were assessed in comparison with fuzzy C-means and k-means. The results do not differ significantly, but the methods differ in average running time: the PSO clustering method is faster than k-means and slower than fuzzy C-means. However, the fuzzy C-means method does not guarantee correct results during further analysis, so the PSO clustering method can be more efficient for implementation in computer vision systems.


“…To capture underlying structure in the history of present illness section from patients EHR, Henao [14] proposed a statistical model that groups patients based on text data in the initial history of present illness (HPI) and final diagnosis (DX) of a patients EHR. For human disease gene expression, Huang [15] presented a new recursive Kmeans spectral clustering method (ReKS) to efficient cluster human diseases. Most of these research have demonstrate effectiveness of their model with real-world experiments, that convinces us of the applicability of clustering patients on cohorts discovering.…”

Section: A Patient Similarity (mentioning)

confidence: 99%

“…The parameters of our deep learning were as follow: the width of the convolution filters w is set to 5,10,15,20,25, and the number of convolutional feature maps m takes on 50, 100, 150, 200. We use stochastic gradient descent to optimize the model's parameters.…”

Section: Experimental Settings (mentioning)

confidence: 99%
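
The hyperparameter values quoted in the statement above can be enumerated as a small grid. The sketch below assumes that every filter width is paired with every feature-map count, which the excerpt does not state; the dictionary keys `filter_width` and `feature_maps` are hypothetical names chosen for illustration.

```python
from itertools import product

# Values taken from the quoted statement.
filter_widths = [5, 10, 15, 20, 25]      # convolution filter width w
feature_maps = [50, 100, 150, 200]       # number of convolutional feature maps m

# Assumption: a full Cartesian grid over (w, m); the excerpt does not say so.
grid = [
    {"filter_width": w, "feature_maps": m}
    for w, m in product(filter_widths, feature_maps)
]

print(len(grid))  # number of candidate configurations under this assumption
```

Under the full-grid assumption this gives 5 × 4 = 20 configurations, each of which would be trained with stochastic gradient descent as described in the statement.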

Measuring Patient Similarities via a Deep Architecture with Medical Concept Embedding

Zhu, Yin, Qian, et al. 2016

2016 IEEE 16th International Conference on Data Mining (ICDM)


Evaluating the clinical similarities between pairs of patients is a fundamental problem in healthcare informatics. A proper patient similarity measure enables various downstream applications, such as cohort studies and comparative effectiveness research on treatments. One major carrier for conducting patient similarity research is Electronic Health Records (EHRs), which are usually heterogeneous, longitudinal, and sparse. Though existing studies on learning patient similarity from EHRs have proven useful in solving real clinical problems, their applicability is limited by the lack of medical interpretations. Moreover, most previous methods assume a vector-based representation for patients, which typically requires aggregation of medical events over a certain time period. As a consequence, the temporal information is lost. In this paper, we propose a patient similarity evaluation framework based on temporal matching of longitudinal patient EHRs. Two efficient methods are presented, unsupervised and supervised, both of which preserve the temporal properties in EHRs. The supervised scheme takes a convolutional neural network architecture and learns an optimal representation of patient clinical records with medical concept embedding. The empirical results on real-world clinical data demonstrate substantial improvement over the baselines. We make our code and sample data available for further study.

