2022
DOI: 10.1101/2022.09.16.508259
Preprint

Direct inference and control of genetic population structure from RNA sequencing data

Abstract: RNAseq data can be used to infer genetic variants, yet its use for estimating genetic population structure remains underexplored. Here, we construct a freely available computational tool (RGStraP) to estimate RNAseq-based genetic principal components (RG-PCs) and assess whether RG-PCs can be used to control for population structure in gene expression analyses. Using whole blood samples from understudied Nepalese populations, we show that RG-PCs had comparable results to paired array-based genotypes, with high …

Citations: cited by 1 publication (1 citation statement)
References: 41 publications (39 reference statements)

“…High-throughput RNA sequencing (RNA-seq) has been frequently applied for measuring gene expression levels (1), assembling de novo transcriptomes (2), detecting copy number alterations (3), and identifying genomic variants that influence gene expression (4). Genotypes called from RNA-seq have also been used to determine population structure (5,6). Historically, RNA-seq has been viewed as unreliable input for performing genetic variation calling for several reasons: RNA-seq typically targets fewer fragments than DNA sequencing, messenger RNA (mRNA) only produces transcripts in coding regions (~1% of a mammalian genome), and a comparative lack of algorithmic development relative to DNA variant callers.…”
Section: Introduction (citation type: mentioning)
confidence: 99%