2020
DOI: 10.1186/s12864-020-6502-7

Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies

Abstract: Background: High-throughput RNA sequencing (RNA-seq) has evolved as an important analytical tool in molecular biology. Although the utility and importance of this technique have grown, uncertainties regarding the proper analysis of RNA-seq data remain. Of primary concern, there is no consensus regarding which normalization and statistical methods are the most appropriate for analyzing this data. The lack of standardized analytical methods leads to uncertainties in data interpretation and study reproducibility,…

Cited by 32 publications (31 citation statements)
References 72 publications
“…All DE analyses were done with R software (version 3.5.3) and the edgeR package [24] (version 3.22.5). Trimmed-mean M values (TMM) normalization was performed to normalize the counts among the different samples [37][38][39][40]. As high dispersion of low counts interfered with some of the statistical approximations used in edgeR, genes with low counts were filtered out using the filterByExpr function as recommended in the user's guide.…”
Section: DE Analysis of the Collected Raw Count Data
Citation type: mentioning
confidence: 99%
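
The low-count filtering step mentioned in the statement above can be illustrated with a minimal Python sketch. This is not edgeR's `filterByExpr` rule, which is an R function with its own group-aware defaults; it is a simplified stand-in that keeps genes whose counts per million (CPM) reach a cutoff in enough samples. The function names `counts_per_million` and `filter_low_counts`, and the thresholds `min_cpm` and `min_samples`, are hypothetical choices made for illustration.

```python
def counts_per_million(counts):
    """Scale raw counts to counts per million (CPM) using each sample's total.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total raw count in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    if any(size == 0 for size in lib_sizes):
        raise ValueError("every sample needs a non-zero library size")
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_low_counts(counts, min_cpm=1.0, min_samples=2):
    """Keep genes with CPM >= min_cpm in at least min_samples samples.

    Assumption: both thresholds are illustrative, not edgeR defaults.
    """
    cpm = counts_per_million(counts)
    return {
        gene: counts[gene]
        for gene, values in cpm.items()
        if sum(v >= min_cpm for v in values) >= min_samples
    }
```

Filtering on CPM rather than raw counts keeps the cutoff comparable across samples with different sequencing depths, which is one common motivation for this kind of rule.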
“…However, most studies follow cohort analysis using standard statistical algorithms to determine DEGs, where various normalization methods followed by negative binomial distributions or Poisson are utilized to model the gene count data. Cutoff score based on P-value generated by statistical modeling is then applied along with expression change threshold [124, 125]. This method of analysis has been successful in different ways, as they could identify biomarkers and prognostic markers and determine which genes are usually overexpressed or downregulated in certain cancer types [126].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
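
The selection rule described in the statement above, a P-value cutoff combined with an expression change threshold, can be sketched in a few lines of Python. The function name `select_degs`, the record layout, and the default thresholds (`alpha=0.05`, `min_abs_log2fc=1.0`) are assumptions made for illustration; the cited works may use different cutoffs, and in practice the P-values are usually adjusted for multiple testing before this step.

```python
def select_degs(results, alpha=0.05, min_abs_log2fc=1.0):
    """Return genes passing both a P-value cutoff and a fold-change threshold.

    results: list of dicts with keys 'gene', 'pvalue', and 'log2fc'
             (hypothetical layout; adapt to the output of the tool in use).
    """
    return [
        r["gene"]
        for r in results
        # A gene is called differentially expressed only if it is both
        # statistically significant and changed by a large enough amount.
        if r["pvalue"] < alpha and abs(r["log2fc"]) >= min_abs_log2fc
    ]


example = [
    {"gene": "G1", "pvalue": 0.001, "log2fc": 2.3},   # significant, large change
    {"gene": "G2", "pvalue": 0.001, "log2fc": 0.4},   # significant, small change
    {"gene": "G3", "pvalue": 0.200, "log2fc": -3.0},  # large change, not significant
    {"gene": "G4", "pvalue": 0.010, "log2fc": -1.5},  # significant, large decrease
]
print(select_degs(example))
```

Taking the absolute value of the log2 fold change means the same threshold captures both up-regulated and down-regulated genes.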
“…Of course, many factors may promote cancer such as chemicals, radiation as well as genetic defects in reparation and replication molecular machinery. To gain inside into such a complex problem as a molecular approach of cancer together with a still-evolving protocol of RNA-seq treatment regarding normalization procedure or error rate (Li et al, 2020), a robust measure was The numbers in the table represent the proportion (%) of tumors of a given cancer type that showed the gene among the top-20 most connected proteins of the subnetwork of up-regulated genes. The pink color concerns up-regulated genes in at least 70% of tumor samples of each cancer type.…”
Section: Galaxy Pipeline
Citation type: mentioning
confidence: 99%