2019
DOI: 10.1101/800060
Preprint

Predicting 3D genome folding from DNA sequence

Abstract: In interphase, the human genome sequence folds in three dimensions into a rich variety of locus-specific contact patterns. Here we present a deep convolutional neural network, Akita, that accurately predicts genome folding from DNA sequence alone. Representations learned by Akita underscore the importance of CTCF and reveal a complex grammar underlying genome folding. Akita enables rapid in silico predictions for sequence mutagenesis, genome folding across species, and genetic variants.

Cited by 11 publications (8 citation statements)
References 47 publications
“…Prediction accuracy improved more for CAGE gene expression measurements than accessibility or ChIP-seq, which suggests that multiple genome training will be worthwhile for data with high regulatory complexity and distal interactions. Efforts to predict spatial contacts between chromosomes as mapped by Hi-C and its relatives fit this criteria, and we hypothesize that training sequence-based models on human and mouse data together will be fruitful [47].…”
Section: Discussion (mentioning)
confidence: 99%
“…Specific TEs have previously been shown to be enriched within LADs, and LADs are postulated to be a key element in multiple mechanisms that many species have evolved to silence and limit accessibility to highly repetitive DNA and other TEs (Hollister and Gaut, 2009; Meuleman et al, 2013). However, TEs have also been implicated in various genomic regulatory functions and are linked to species-specific transcription-factor binding sites in mammals (Bourque et al, 2008; Kunarso et al, 2010; Wang et al, 2007), as well as other speciation events via their impact on 3D genome folding (Choudhary et al, 2020; Fudenberg et al, 2019). TEs are further implicated in the genesis and regulation of long noncoding RNAs (Kapusta et al, 2013; Kelley and Rinn, 2012).…”
Section: T1- and T2-LADs Have Distinct Genomic Features (mentioning)
confidence: 99%
“…6) [18], as covariates for models of more complicated regulatory events such as enhancer-promoter looping [19], and in high-resolution association mapping of animal models as it becomes widespread [20]. Thus, DeepArk will contribute to a number of diverse experimental and computational analyses, both directly through its predictions or as part of larger computational pipelines.…”
Section: Main (mentioning)
confidence: 99%