2022
DOI: 10.1080/01621459.2022.2116331

Selective Inference for Hierarchical Clustering

Cited by 49 publications
(62 citation statements)
References 30 publications
“…Therefore, we judged that the data-driven spatial domains identified from BayesSpace were not sufficiently reliable to use for the downstream analyses, and instead proceeded with the histology-driven manual annotations for all further analyses. In addition, we note that using the manual annotations avoids potential issues due to inflated false discoveries resulting from circularity when performing differential gene expression testing between sets of cells or spots defined by unsupervised clustering, when the same genes are used for both clustering and differential testing [37]. Next, in addition to the manually annotated LC regions, we also manually annotated a set of individual spots that overlapped with NE neuron cell bodies identified within the LC regions, based on pigmentation, cell size, and morphology from the H&E histology images (Supplementary Figure 6A).…”
Section: Results
Type: mentioning
confidence: 99%
“…Supervised feature selection, which involves filtering features that are based on association with the outcome, is another form of preprocessing that can cause leakage when performed outside cross validation. More broadly, recent work on post selection inference highlights the problem of performing statistical analyses such as differential expression after clustering [72], even if the clusters were defined on independent datasets [73].…”
Section: Pitfall 4: Leaky Preprocessing
Type: mentioning
confidence: 99%
“…Whilst our approach has been developed for changepoint problems, the general idea can be applied to other scenarios such as clustering (Gao et al, 2022; Chen and Witten, 2022) or regression trees (Neufeld et al, 2022). For example, current methods for post-selection inference after clustering are based on a test statistic that compares the mean of the cluster, and fixed the projection of the data that is orthogonal to this.…”
Section: Discussion
Type: mentioning
confidence: 99%