2022
DOI: 10.1002/tpg2.20249

Modeling chromatin state from sequence across angiosperms using recurrent convolutional neural networks

Abstract: Accessible chromatin regions are critical components of gene regulation but modeling them directly from sequence remains challenging, especially within plants, whose mechanisms of chromatin remodeling are less understood than in animals. We trained an existing deep learning architecture, DanQ, on leaf ATAC-seq data from 12 angiosperm species to predict the chromatin accessibility of sequence windows within and across species. We also trained DanQ on DNA methylation data from 10 angiosperms, because unmethylate…

Citations: cited by 6 publications (4 citation statements)
References: 101 publications (125 reference statements)
“…DanQ is a hybrid convolutional and recurrent neural network specifically designed for predicting the function of DNA sequences. It has demonstrated impressive performance in predicting chromatin states in plant species, making it a suitable choice for our comparative analysis 49 . For each task, the CNN+LSTM model was trained from scratch using one-hot encoded DNA sequences as input.…”
Section: Methods (mentioning)
confidence: 99%
“…Though tools like AlphaFold2 (Jumper et al, 2021) have dramatically improved our ability to study coding sequence, similarly performing tools do not yet exist for non-coding regions. Nevertheless, over the last decade deep learning models have rapidly improved performance in predicting non-coding genomic features such as chromatin accessibility (Kelley, 2020; Wrightsman et al, 2022), transcription factor binding (Žiga Avsec, Weilert, et al, 2021; Mejía-Guerra & Buckler, 2019), and RNA abundance (Žiga Avsec, Agarwal, et al, 2021; Linder et al, 2023) directly from DNA sequence. These models can then be queried to highlight functional non-coding sites, which can be useful for filtering large sets of variants down to promising genome editing targets.…”
Section: Introduction (mentioning)
confidence: 99%
“…Generally, deep learning refers to computational methods that aim to learn a hierarchical representation of data by functionally relating the data in many layers [24]. In plant data, these models have learned to predict chromatin state from sequence [25], identify and classify key stress-responsive genes [26], and detect seasonal changes across fields [27]. In addition to these successes, many state-of-the-art models take in not only numerical data but also the structure of the relationships between the data.…”
Section: Introduction (mentioning)
confidence: 99%
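
The Methods citation statement above mentions training on one-hot encoded DNA sequences. As a rough illustration of what that input representation usually looks like, here is a minimal sketch. The helper name `one_hot_encode`, the A, C, G, T column order, and the choice to map ambiguous bases such as N to an all-zero row are assumptions made for this example; none of them are taken from the cited papers.

```python
# Hypothetical helper, not from the cited work: one-hot encode a DNA string.
# Assumed column order is A, C, G, T.
BASES = "ACGT"


def one_hot_encode(seq):
    """Return one 4-element row per base.

    Bases outside A/C/G/T (for example N) become an all-zero row,
    which is one common convention and an assumption here.
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    # Print each base next to its encoded row.
    for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, row)
```

A sequence window of length L then becomes an L x 4 matrix, which is the kind of input a convolutional layer can scan along the sequence axis.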