2018
DOI: 10.1101/393926
Preprint

Deciphering regulatory DNA sequences and noncoding genetic variants using neural network models of massively parallel reporter assays

Abstract: The relationship between noncoding DNA sequence and gene expression is not well understood. Massively parallel reporter assays (MPRAs), which quantify the regulatory activity of large libraries of DNA sequences in parallel, are a powerful approach to characterize this relationship. We present MPRA-DragoNN, a convolutional neural network (CNN)-based framework to predict and interpret the regulatory activity of DNA sequences as measured by MPRAs. While our method is generally applicable to a variety of MPRA desi…

Cited by 9 publications (8 citation statements)
References 67 publications
“…86 Massively parallel assays have also been used in the context of both splicing 88,89 and miRNA-mediated regulation, 90,91 showing, for instance, that up to 16% of splice disrupting variants are located in deep intronic regions. 88 With the development of deep learning frameworks for sequence-based predictions, 92,93 data generated by these assays, will now fuel the construction of predictive models. These models will, in turn, allow quantifying the regulatory impact of novel genetic variants on transcriptional activity, 86 alternative splicing, 94 or miRNA-mediated regulation.…”
Section: Deciphering the Regulatory Code To Predict The Effect Of R… (mentioning)
confidence: 99%
“…Taking inspiration from work in the field of image recognition and genomics 26,29-31, we investigated the first convolutional layer to see which features our model deemed important by interpreting the filter weights learned from input sequences as sequence logos (Fig. 2E).…”
Section: Improved Interpretability Of Convolutional Neural Network Pr… (mentioning)
confidence: 99%
“…Moreover, ATAC-seq peaks are characterized by 1 or more summits, which have higher chromatin accessibility than flanking genomic regions also within ATAC-seq peaks 6 . Few studies have explored the relationship between transcriptional activity and open chromatin regions defined by ATAC-seq 7, 8 . For instance, it is not known whether higher chromatin accessibility within ATAC-seq peaks also corresponds to enhanced transcriptional activity, or whether proximity to ATAC-seq peak summits influences the likelihood for a SNP to affect enhancer activity.…”
Section: Introduction (mentioning)
confidence: 99%