2017
DOI: 10.1186/s12859-017-1619-7

Meta-analysis approach as a gene selection method in class prediction: does it improve model performance? A case study in acute myeloid leukemia

Abstract: Background: Aggregating gene expression data across experiments via meta-analysis is expected to increase the precision of the effect estimates and to increase the statistical power to detect a certain fold change. This study evaluates the potential benefit of using a meta-analysis approach as a gene selection method prior to predictive modeling in gene expression data. Results: Six raw datasets from different gene expression experiments in acute myeloid leukemia (AML) and 11 different classification methods were u…
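The precision argument in the abstract can be illustrated with a small sketch. The abstract is truncated here, so the paper's actual meta-analysis model is not visible; the code below assumes the simplest case, fixed-effect inverse-variance pooling, and ranks genes by the pooled effect divided by its standard error. The function names, gene names, and numbers are all hypothetical and are not taken from the paper.

```python
import math


def pool_fixed_effect(effects, variances):
    """Inverse-variance (fixed-effect) pooling of one gene's per-study effects.

    Assumption: a fixed-effect model, chosen only for illustration.
    Returns (pooled_effect, pooled_variance).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    return pooled, 1.0 / total


def select_genes(per_gene, top_k):
    """Rank genes by |pooled effect| / pooled standard error; keep top_k."""
    scores = {}
    for gene, (effects, variances) in per_gene.items():
        pooled, var = pool_fixed_effect(effects, variances)
        scores[gene] = abs(pooled) / math.sqrt(var)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: per-study log fold changes and their variances.
toy = {
    "gene_a": ([1.0, 1.2, 0.8], [0.25, 0.25, 0.25]),
    "gene_b": ([0.1, -0.2, 0.0], [0.25, 0.25, 0.25]),
    "gene_c": ([-0.9, -1.1, -1.0], [0.5, 0.5, 0.5]),
}

pooled, pooled_var = pool_fixed_effect(*toy["gene_a"])
selected = select_genes(toy, top_k=2)
print(pooled, pooled_var, selected)
```

With three studies of equal variance, the pooled variance is one third of any single study's variance, which is the gain in precision the abstract refers to. The selected genes would then be passed to a classifier in a separate step.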

citations
Cited by 4 publications
(4 citation statements)
references
References 52 publications
“…Several systematic approaches and guidelines have been proposed for selecting important biomarker candidates across experiments [12]. In the circumstance that a single gene expression analysis might not provide a reliable and generalizable conclusion, the quantitative analysis of the combined datasets from multiple sources and technologies appears to be an efficient solution to increase the sample size and enhance the statistical power and thus possibly helps identify more clinically relevant biomarkers [13,14]. Choosing sufficient control samples and appropriate data mining techniques, removing batch effects across different platforms and studies are required to ensure a robust process of biomarker discovery and validation [15].…”
Section: Introduction
mentioning
confidence: 99%
“…For instance, leukemia has 7129 gene dimensions but only 72 samples for analysis, which increases the challenges of feature dimensionality reduction. Various methods exist for gene selection, but they are computationally costly and complex [15–18]. Many researchers have introduced information to facilitate gene selection, but these methods cannot properly reflect the length of a set of gene features, namely, the number of dimensions after dimensionality reduction.…”
Section: Introduction
mentioning
confidence: 99%
“…Supervised learning is used to identify genes related to known categories such as cancer type or clinical outcome, and unsupervised learning is used to explore the similarity of gene expression patterns [6]. A large number of supervised learning have been used to explore hematological malignancies, including weighted voting, k-nearest neighbors, support vector machines, artificial neural networks, decision trees, random forest, and nearest shrunken centroid algorithms [7–10]. For unsupervised learning, K-means clustering, principal component analysis, nonnegative matrix factorization, and weighted co-expression network analysis (WGCNA) have been widely used to investigate hematological malignancies [11–14].…”
mentioning
confidence: 99%
“…[6] A large number of supervised learning have been used to explore hematological malignancies, including weighted voting, k-nearest neighbors, support vector machines, artificial neural networks, decision trees, random forest, and nearest shrunken centroid algorithms [7–10]. For unsupervised learning, K-means clustering, principal component analysis, nonnegative matrix factorization, and weighted co-expression network analysis (WGCNA) have been widely used to investigate hematological malignancies [11–14]. However, when conducting biological exploration, no formal classifier is needed to observe the correlation between 2 genes and investigate 1 gene's effect on the prognosis of patients.…”
mentioning
confidence: 99%