Background
The discovery that somatic cells can be reprogrammed to induced pluripotent stem cells (iPSCs) has provided a foundation for in vitro human disease modelling, drug development and population genetics studies. Gene expression plays a critical role in complex disease risk and therapeutic response. However, while the genetic background of reprogrammed cell lines has been shown to strongly influence gene expression, the effect has not been evaluated at the level of individual cells which would provide significant resolution. By integrating single cell RNA-sequencing (scRNA-seq) and population genetics, we apply a framework in which to evaluate cell type-specific effects of genetic variation on gene expression.
Results
Here, we perform scRNA-seq on 64,018 fibroblasts from 79 donors and map expression quantitative trait loci (eQTLs) at the level of individual cell types. We demonstrate that the majority of eQTLs detected in fibroblasts are specific to an individual cell subtype. To address if the allelic effects on gene expression are maintained following cell reprogramming, we generate scRNA-seq data in 19,967 iPSCs from 31 reprogramed donor lines. We again identify highly cell type-specific eQTLs in iPSCs and show that the eQTLs in fibroblasts almost entirely disappear during reprogramming.
Conclusions
This work provides an atlas of how genetic variation influences gene expression across cell subtypes and provides evidence for patterns of genetic architecture that lead to cell type-specific eQTL effects.
Group heteroscedasticity is commonly observed in pseudo-bulk single-cell RNA-seq datasets and when not modelled appropriately, its presence can hamper the detection of differentially expressed genes.
Most bulk RNA-seq methods assume equal group variances which will under- and/or over-estimate the true variability in such datasets.
We present two methods that account for heteroscedastic groups, namely voomByGroup and voomWithQualityWeights using a blocked design (voomQWB).
Compared to current gold-standard methods that do not account for heteroscedasticity, we show results from simulation studies and various experiments that demonstrate the superior performance of both voomByGroup and voomQWB in error control and power when group variances in pseudo-bulk scRNA-seq data are unequal.
We recommend the use of either of these methods over established approaches, with voomByGroup having the advantage of accurate variance estimation since group variance trends can take on different "shapes", whilst voomQWB has the advantage of catering to complex study designs.
Group heteroscedasticity is commonly observed in pseudo-bulk single-cell RNA-seq datasets and its presence can hamper the detection of differentially expressed genes. Since most bulk RNA-seq methods assume equal group variances, we introduce two new approaches that account for heteroscedastic groups, namely voomByGroup and voomWithQualityWeights using a blocked design (voomQWB). Compared to current gold-standard methods that do not account for group heteroscedasticity, we show results from simulations and various experiments that demonstrate the superior performance of voomByGroup and voomQWB in terms of error control and power when group variances in pseudo-bulk single-cell RNA-seq data are unequal.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.