2017
DOI: 10.1093/nar/gkx204

Differential expression analysis for RNAseq using Poisson mixed models

Abstract: Identifying differentially expressed (DE) genes from RNA sequencing (RNAseq) studies is among the most common analyses in genomics. However, RNAseq DE analysis presents several statistical and computational challenges, including over-dispersed read counts and, in some settings, sample non-independence. Previous count-based methods rely on simple hierarchical Poisson models (e.g. negative binomial) to model independent over-dispersion, but do not account for sample non-independence due to relatedness, populatio…

Cited by 66 publications (71 citation statements)
References 139 publications (252 reference statements)
“…Despite this promising outlook, analytic methods remain insufficient for achieving truly personalized medicine. The standard procedure is to evaluate one gene at a time, which results in low statistical power to identify the disease-associated genes [28]. Thus, more accurate models that leverage the large amounts of genomic data now available are in great demand.…”
Section: Real Data Study (mentioning)
confidence: 99%
“…However, analyzing normalized expression data can be suboptimal as this approach fails to account for the mean-variance relationship existed in raw counts, leading to a potential loss of power [14]. Indeed, similar loss of power has been well documented for methods that can only analyze normalized data in many other omics sequencing studies [15,16]. Besides direct modeling of count data, identifying SE genes also requires the development of statistical methods that can produce well calibrated p-values to ensure proper control of type I error.…”
Section: Introduction (mentioning)
confidence: 64%
“…We then used the hierarchical agglomerative clustering algorithm in the R package amap (v0.8-17) to cluster identified SE genes detected by SPARK into five groups. Afterwards, we summarized the gene expression patterns by using the expression level of the five cluster centers (Fig.…”
Section: Clustering SE Genes Detected by SPARK (mentioning)
confidence: 99%
“…These properties make pASTA a powerful tool to identify candidate SNPs that are potentially associated with the trait(s) of interest in the screening step of a GWAS, which then can be followed by deeper characterization/interpretation of the associations through other methods, such as biological validation. For example, RNA sequencing experiments can be performed to examine whether the candidate SNPs affect the expression level of the target gene [34]. Crispr-Cas9 knockout screening can be carried out in human cell lines to determine the functional significance of these candidate SNPs [35].…”
Section: Discussion (mentioning)
confidence: 99%