Most genetic risk for human diseases lies within non-coding regions of the genome, which are predicted to regulate gene expression, often in a tissue- and stage-specific manner. This has motivated the building of extensive eQTL resources to understand how human allelic variation affects gene expression and splicing throughout the body, focusing primarily on adult tissue. Given the importance of regulatory pathways during brain development, we characterize the genetic control of the developing human cerebral cortical transcriptome, including expression and splicing, in 201 mid-gestational human brains, to understand how common allelic variation affects gene regulation during development. We leverage expression and splice quantitative trait loci to identify genes and isoforms relevant to neuropsychiatric disorders and brain volume. These findings demonstrate genetic mechanisms by which early developmental events have a striking and widespread influence on adult anatomical and behavioral phenotypes, as well as the evolution of the human cerebral cortex.
Highlights
• Genome-wide map of human fetal brain eQTLs and sQTLs provides a new view of genetic control of expression and splicing.
• There is substantial contrast between genetic control of transcript regulation in mature versus developing brain.
• We identify novel regulatory regions specific to fetal brain development.
• Integration of eQTLs and GWAS reveals specific relationships between expression and disease risk for neuropsychiatric diseases and relevant human brain phenotypes.
Results
To identify genetic variants regulating gene expression in the developing brain, we performed high-throughput RNA sequencing and high-density genotyping at 2.5 million sites in a set of 233 fetal brains (Figure 1). After quality control and normalization of gene expression quantifications, and genotype imputation to the 1000 Genomes Project phase 3 multi-ethnic reference panel (Methods, Figure S1; Genomes Project et al., 2015), we obtained a starting dataset of 15,925 expressed genes (12,943 protein coding and 767 long noncoding RNAs) and 6.6 million autosomal single nucleotide polymorphisms (SNPs) from each individual. Principal component analysis (PCA) of ancestry (Methods) indicates that the donors in our study come from admixed ancestries of Mexican, European, African American, and Chinese descent (Figure S2). The resulting dataset is the first population-level fetal brain expression dataset.
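The PCA-based ancestry analysis described above can be illustrated with a minimal sketch. It assumes a samples-by-SNPs matrix of allele dosages (0, 1, or 2) and uses simulated genotypes rather than the study's data. The function name `genotype_pcs`, the per-SNP standardization, and the two simulated groups are illustrative assumptions, not the pipeline described in Methods.

```python
import numpy as np

def genotype_pcs(dosages, n_components=2):
    """Return top principal component scores for a samples x SNPs dosage matrix."""
    dosages = np.asarray(dosages, dtype=float)
    # Drop monomorphic SNPs: zero variance, so they cannot be standardized.
    variable = dosages.std(axis=0) > 0
    g = dosages[:, variable]
    # Center and scale each SNP so all variants contribute on the same scale.
    z = (g - g.mean(axis=0)) / g.std(axis=0)
    # SVD of the standardized matrix; u * s gives the per-sample PC scores.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Hypothetical data: two groups of 50 samples with diverged allele frequencies.
rng = np.random.default_rng(0)
n_per_group, n_snps = 50, 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_snps), 0.05, 0.95)
group_a = rng.binomial(2, freq_a, size=(n_per_group, n_snps))
group_b = rng.binomial(2, freq_b, size=(n_per_group, n_snps))

pcs = genotype_pcs(np.vstack([group_a, group_b]), n_components=2)
```

In this toy setting the first component is expected to separate the two simulated groups. In an eQTL analysis, components like these are typically also carried forward as genotype PC covariates to limit confounding by population structure.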
Robust identification of fetal brain cis-eQTLs
We identified cis-eQTLs by testing all SNPs within a 1 Mb window of the transcription start site (TSS) of each gene using a permutation procedure implemented in FastQTL (Ongen, Buil, Brown, Dermitzakis, & Delaneau, 2016), while adjusting for known covariates (RIN, sex, age, and genotype PCs) and inferred covariates (Methods, Figure S3). Adjusting for inferred covariates has been shown to greatly increase sensitivity for cis-eQTL detection (H. M. Kang et al., 2008; Leek & Storey, 2007; Mostafavi et al., 2013). We identified 6,546 genes with a cis-eQTL at a 5% false discovery r...