2019
DOI: 10.1093/biostatistics/kxz001

Are clusterings of multiple data views independent?

Abstract: In the Pioneer 100 (P100) Wellness Project (Price and others, 2017), multiple types of data are collected on a single set of healthy participants at multiple timepoints in order to characterize and optimize wellness. One way to do this is to identify clusters, or subgroups, among the participants, and then to tailor personalized health recommendations to each subgroup. It is tempting to cluster the participants using all of the data types and timepoints, in order to fully exploit the available information. How…

Cited by 16 publications
(35 citation statements)
References 23 publications
“…Park & Lock (2020) jointly analyzed multiple data sets for heterogeneous groups of objects with heterogeneous feature sets. Gao et al (2020) and Wang & Allen (2019) considered clustering problems for multi-view data. Li et al (2018b) proposed a regression model with multi-view data as covariates.…”
Section: Introduction (mentioning)
confidence: 99%
“…Another example is in the study of Alzheimer’s disease where recent efforts have been focusing on combining brain imaging data, genetic data, as well as clinical outcomes in predicting disease (Nathoo and others, 2019). A third example is a large study profiling different states of wellness, where genetic, proteomic and metabolic data among other types of data are collected (Gao and others, 2020). Together, the different types of data provide a more comprehensive picture which has the potential of better characterizing and optimizing what it is to be healthy.…”
Section: Introduction (mentioning)
confidence: 99%
“…As should be clear, the distances may be meaningful, but there is no clear way to say which distances are associated with a group of malicious observations. We might consider testing the difference, but, as Gao, Bien, and Witten highlighted recently [22], this should be considered double-dipping. The clusters are generated expressly to identify a difference.…”
Section: ICA and Clustering Discussion (mentioning)
confidence: 99%
“…In threat detection, we would prefer to capture all malicious traffic, even if that requires more false positives. Gao and colleagues [22] used adaptive control version of PCA for feature extraction alongside an incremental extreme learning machine as part of an IDS. Li et al proposed an anomaly detection framework for network traffic that used PCA to extract features and reduce redundancy.…”
Section: Literature Review (mentioning)
confidence: 99%