2012
DOI: 10.1080/15287394.2012.674912
|View full text |Cite
|
Sign up to set email alerts
|

Optimal Strategies for Sequential Validation of Significant Features from High-Dimensional Genomic Data

Abstract: High-dimensional genomic studies play a key role in identifying critical features that are significantly associated with a phenotypic outcome. The two most important examples are the detection of (1) differentially expressed genes from genome-wide gene expression studies and (2) single-nucleotide polymorphisms (SNPs) from genome-wide association studies. Such experiments are often associated with high noise levels, and the validity of statistical conclusions suffers from low sample size compared to large numbe… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
2

Citation Types

0
6
0

Year Published

2013
2013
2020
2020

Publication Types

Select...
5
1
1

Relationship

2
5

Authors

Journals

citations
Cited by 8 publications
(6 citation statements)
references
References 10 publications
0
6
0
Order By: Relevance
“…It suggested that elevated expression of these three genes were associated with poor OS among the smoking related lung adenocarcinoma patients. However, there is a puzzle that the identified genes in training cohorts could not easily be validated in external cohorts 41. One reason might be the effects of genes have broad confidence intervals so that it is difficult to identified using a single validation database.…”
Section: Discussionmentioning
confidence: 99%
“…It suggested that elevated expression of these three genes were associated with poor OS among the smoking related lung adenocarcinoma patients. However, there is a puzzle that the identified genes in training cohorts could not easily be validated in external cohorts 41. One reason might be the effects of genes have broad confidence intervals so that it is difficult to identified using a single validation database.…”
Section: Discussionmentioning
confidence: 99%
“…One strategy to address the problem of multiple testing is sequential validation, where significant genes identified in a discovery set enter as candidates in a validation set. We previously recommended an optimized order for such a stepwise procedure, where the datasets with the largest sample size (and the lowest measurement variance) are used for discovery steps and the datasets with the smallest sample size for validation steps [32]. Based on this approach, the Rotterdam and Transbig cohorts were here used for gene discovery and the Mainz cohort for validation.…”
Section: Discussionmentioning
confidence: 99%
“…As a consequence, there is high confidence in the relevance of the final candidates. However, it has been shown that this approach yields many false negative results (19). Instead, we applied in this study a combined sequential and meta-analysis approach, in which candidate biomarkers with prognostic relevance in the primary cohort were further evaluated in a meta-analysis of several publicly available datasets.…”
Section: Discussionmentioning
confidence: 99%
“…Surprisingly, there is virtually no overlap between hitherto published gene signatures. An explanation might be that effects of single genes with broad confidence intervals are difficult to confirm using a sequential validation strategy, that is, when genes identified as significant in one study are tested for significance in separate subsequent studies of comparably small sample size (19).…”
Section: Introductionmentioning
confidence: 99%