2016
DOI: 10.1093/bioinformatics/btw623

Combining multiple tools outperforms individual methods in gene set enrichment analyses

Abstract: Motivation: Gene set enrichment (GSE) analysis allows researchers to efficiently extract biological insight from long lists of differentially expressed genes by interrogating them at a systems level. In recent years, there has been a proliferation of GSE analysis methods and hence it has become increasingly difficult for researchers to select an optimal GSE tool based on their particular dataset. Moreover, the majority of GSE analysis methods do not allow researchers to simultaneously compare gene set level resu…

Cited by 151 publications (95 citation statements)
References 53 publications
“…The single-cell RNA-Seq data from these studies represent an invaluable source to define gene sets made of enriched/specific markers of cells found in these organoids and in the two surveyed brain areas. EGSEA (Alhamdoosh et al, 2017) was used to find the closest matches to the differentially expressed genes found in BrainSpheres at the three time-points, using a significance score (from zero to 100, with zero being the least significant and 100 representing the most significant gene sets). Heatmap of EGSEA for 123 gene sets obtained from the single-cell RNA-Seq datasets showed high scores at the three time-points for the three following gene sets: xiang_mgeo_72days_8 (minimum score = 97, p.adj = 2e-20), quadrato_6months_14 (minimum score = 87, p.adj = 2e-14), and sloan_130days_1 (minimum score = 88, p.adj = 2e-20) (Fig.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…GSEA, using Kyoto Encyclopedia of Genes and Genomes (KEGG) as the reference database, was employed to provide biological pathway relevance to the DEA analysis. To avoid methodology-associated biases, a total of eight GSEA algorithms (ora, camera, roast, fry, plage, zscore, gsva, ssgsea) were integrated using Bioconductor's Ensemble Gene Set Testing workflow. This workflow utilizes the linear model generated by limma as input, which includes the β‐coefficient and p‐value of the cell:generation interaction term of every gene in the count matrix.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…To further increase the statistical robustness, the Benjamini–Hochberg false discovery rate (FDR) procedure was performed on the obtained $P_{\mathrm{avg}}(\gamma_i)$ to correct any multiple testing related type‐I errors. In addition, to illustrate the overall magnitude of changes in gene expression for a given gene set $\gamma_i$, an average log FC value was calculated as $\frac{1}{|\gamma_i|}\sum_{j=1}^{|\gamma_i|}|\mathrm{logFC}_j|$, where $\mathrm{logFC}_j$ is the log2 of the fold‐change of the $j$th gene in $\gamma_i$. $\mathrm{logFC}_j$ equals the β‐coefficient of the interaction term between Cell Line A and generation in the limma linear regression model and reflects the change of gene $j$ for its expression level in Cell Line A versus B over the generational time course that was studied.…”
Section: Methods
Citation type: mentioning
confidence: 99%
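The two calculations described in this statement can be sketched in a few lines of Python. This is a minimal illustration, not the cited authors' code (their workflow runs in R/Bioconductor): the function names `average_abs_logfc` and `benjamini_hochberg`, and the toy gene names and values, are assumptions made for the example.

```python
from typing import Dict, List, Sequence


def average_abs_logfc(logfc: Dict[str, float], gene_set: Sequence[str]) -> float:
    """Mean of |logFC_j| over all genes j in one gene set.

    Assumes every gene in `gene_set` has an entry in `logfc`.
    """
    if not gene_set:
        raise ValueError("gene set is empty")
    return sum(abs(logfc[gene]) for gene in gene_set) / len(gene_set)


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-gene log2 fold changes and one hypothetical gene set.
    toy_logfc = {"geneA": 1.0, "geneB": -3.0, "geneC": 0.5}
    print(average_abs_logfc(toy_logfc, ["geneA", "geneB"]))
    # Hypothetical per-gene-set p-values.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Taking absolute values before averaging means up- and down-regulated genes in the same set do not cancel, which matches the stated aim of summarizing the overall magnitude of change.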
“…Multiple approaches were applied to examine responses at the phylogroup and individual strain levels and also due to variability among some treatment replicates (Table 1). We used an Ensemble of Gene Set Enrichment Analyses (EGSEA; [62]) to identify sets of genes that collectively were significantly differentially expressed based on a consensus of twelve GSEA algorithms (Benjamini-Hochberg adjusted p-value < 0.01, unadjusted p-value calculated using Wilkinson's method [64] to combine p-values from the GSEA algorithms). We defined each gene set to contain MicroTOOLs gene targets from a specific phylogroup and physiological response, e.g.…”
Section: Microarray Analyses
Citation type: mentioning
confidence: 99%
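As a rough illustration of how Wilkinson's method combines p-values from several algorithms, the sketch below computes the probability of seeing at least r p-values as small as the r-th smallest one observed, under the assumption that the n tests are independent. The function name `wilkinson_combine` and the default `r = 1` are choices made for this example; the statement above does not say which r was used, and p-values from GSE methods run on the same data are unlikely to be fully independent.

```python
from math import comb
from typing import Sequence


def wilkinson_combine(pvalues: Sequence[float], r: int = 1) -> float:
    """Combine p-values with Wilkinson's method.

    Uses the r-th smallest p-value, p_(r), and returns the binomial tail
    probability that at least r of n independent tests reach p <= p_(r).
    """
    n = len(pvalues)
    if not 1 <= r <= n:
        raise ValueError("r must be between 1 and the number of p-values")
    p_r = sorted(pvalues)[r - 1]
    return sum(
        comb(n, k) * p_r**k * (1.0 - p_r) ** (n - k)
        for k in range(r, n + 1)
    )


if __name__ == "__main__":
    # Hypothetical p-values for one gene set from three methods.
    print(wilkinson_combine([0.01, 0.2, 0.5], r=1))
```

With r = 1 the combined value reduces to 1 - (1 - p_min)^n, so a single very small p-value can dominate the result; larger r values demand agreement among more methods.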