Chromatin accessibility identifies active regions of the genome, often at transcription factor (TF) binding sites, enhancers, and promoters, and contains regulatory genetic variation. Functionally related accessible sites have been reported to be co-accessible; however, the prevalence and range of co-accessibility are unknown. We perform ATAC-seq in induced pluripotent stem cells from 134 individuals and integrate it with RNA-seq, WGS, and ChIP-seq, providing the first long-range, chromosome-length analysis of co-accessibility. We show that co-accessibility is highly connected, with sites having a median of 24 co-accessible partners up to 250 Mb away. We also show that co-accessibility can identify, de novo, known and novel co-expressed genes, as well as co-regulatory TFs and chromatin states. We perform cis- and trans-caQTL and trans-eQTL analyses and examine allelic effects of co-accessibility, identifying tens of thousands of trans-caQTLs and showing that trans genetic effects can be propagated through co-accessibility to gene expression for cell-type- and disease-relevant genes.

Here, we perform ATAC-seq in 152 induced pluripotent stem cell (iPSC) lines from 134 individuals from iPSCORE 20-23, and integrate these data with available WGS and RNA-seq for the same individuals. We call over 1 million accessible chromatin sites and use population-level information to identify co-accessible sites by testing for correlation in accessibility between all sites chromosome-wide. We show that co-accessibility is highly connected, with sites being co-accessible with an average of 24 other sites, and that it can span long distances (up to hundreds of megabases). We then use these significant relationships to create co-accessibility networks, and show that neighbors in these networks are enriched for TF co-binding partners, functionally related TFs, spatially colocalized loci (i.e., loci in a chromatin loop), and co-expressed genes up to 100 Mb apart, and that the networks can also be used to infer novel TF functionality.
Next, we examine the genetic architecture of co-accessibility by measuring allele-specific effects (ASE) and performing one of the largest caQTL studies to date. We show that genetic effects spread through co-accessibility, with highly connected sites being more likely to have a cis-caQTL or exhibit ASE; additionally, strong ASE explains 52% of co-accessible weaker ASE. Finally, we leverage these networks to identify more than 92,000 trans-caQTLs greater than 1.5 Mb from their target, 9 of which are also trans-eQTLs for cell-type- and disease-relevant genes. Overall, our data reveal that chromatin co-accessibility is highly connected, spans the length of entire chromosomes, can identify co-regulatory TFs de novo, is a mechanism underlying trans genetic effects, and can give insight into trans-eQTL mechanisms.
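The idea of testing for correlation in accessibility between all sites on a chromosome, and then treating significant pairs as network edges, can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the Pearson correlation, the Fisher z approximation for p-values, the Bonferroni threshold, the function names (`coaccessible_pairs`, `degree`), and the simulated accessibility matrix are all assumptions made for illustration.

```python
import math
import numpy as np


def coaccessible_pairs(acc, alpha=0.05):
    """Return (i, j, r, p) for site pairs with significant correlation.

    acc: array of shape (n_sites, n_samples) holding normalized
    accessibility for the sites of one chromosome (hypothetical input).
    """
    n_sites, n_samples = acc.shape
    r = np.corrcoef(acc)                      # Pearson r between all rows
    iu, ju = np.triu_indices(n_sites, k=1)    # each unordered pair once
    r_pairs = np.clip(r[iu, ju], -0.999999, 0.999999)
    # Fisher z-transform: atanh(r) * sqrt(n - 3) is approximately N(0, 1)
    # under the null hypothesis of no correlation (assumed test).
    z = np.arctanh(r_pairs) * math.sqrt(n_samples - 3)
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    keep = p < alpha / len(p)                 # Bonferroni (assumed correction)
    return [
        (int(i), int(j), float(rv), float(pv))
        for i, j, rv, pv in zip(iu[keep], ju[keep], r_pairs[keep], p[keep])
    ]


def degree(pairs, n_sites):
    """Count co-accessible partners per site (node degree in the network)."""
    deg = np.zeros(n_sites, dtype=int)
    for i, j, _, _ in pairs:
        deg[i] += 1
        deg[j] += 1
    return deg


# Simulated data: 6 sites x 134 individuals. Sites 0-2 share a latent
# signal and should be co-accessible; sites 3-5 are independent noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=134)
acc = rng.normal(size=(6, 134))
acc[:3] = shared + 0.3 * acc[:3]

pairs = coaccessible_pairs(acc)
deg = degree(pairs, acc.shape[0])
print(pairs)
print(deg)
```

At the scale described here (over 1 million sites), an all-pairs matrix per chromosome would need blocked computation and a multiple-testing procedure chosen for that scale; the sketch only shows the shape of the calculation.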
Results
Samples, ATAC-seq data generation, and ATAC peak characterization
To measure chromatin co-accessibility, accessible sites were identified from ATAC-seq performed on 152 iPSC lines. These lines were generated from 134 individuals (Supplementary Table 1...