2018
DOI: 10.1101/279323
Preprint

A general framework for predicting the transcriptomic consequences of non-coding variation

Abstract: Genome-wide association studies (GWASs) for complex traits have implicated thousands of genetic loci. Most GWAS-nominated variants lie in noncoding regions, complicating the systematic translation of these findings into functional understanding. Here, we leverage convolutional neural networks to assist in this challenge. Our computational framework, peaBrain, models the transcriptional machinery of a tissue as a two-stage process: first, predicting the mean tissue-specific abundance of all genes …

Cited by 2 publications (2 citation statements)
References 43 publications
“…Next, we compared our best baseline and Xpresso models with the reported results from existing models described in the literature, delineating five categories based upon the types of features used either as input data or as intermediate training stages: (1) those using nothing more than sequence features, which included our method and three others (Abdalla et al, 2018; Bessière et al, 2018; McLeay et al, 2012); (2) those using MPRAs to measure promoter activity (van Arensbergen et al, 2017; Cooper et al, 2006; Landolin et al, 2010; Nguyen et al, 2016); (3) those using the binding signal of TFs at promoter regions, as measured by ChIP (Cheng et al, 2011, 2012; McLeay et al, 2012; Ouyang et al, 2009; Zhou et al, 2018); (4) those using the signal of histone marks, such as H3K4me1, H3K4me3, H3K9me3, H3K27Ac, H3K27me3, and H3K36me3 at promoters and gene bodies, as measured by ChIP (Abdalla et al, 2018; Cheng et al, 2011; Dong et al, 2012; Karlić et al, 2010; McLeay et al, 2012; Schmidt et al, 2017; Zhou et al, 2018); and (5) those using the DNase hypersensitivity signal at promoters and nearby enhancers (Dong et al, 2012; Duren et al, 2017; McLeay et al, 2012; Schmidt et al, 2017; Zhou et al, 2018) (Figure 4E). Many of these models were trained and tested on cell lines, such as K562, GM12878, and mESCs, for which ChIP data are available for a multitude of histone marks and TFs.…”
Section: Cell | Citation type: mentioning
confidence: 99%
“…Next, we compared our best baseline and Xpresso models to existing models described in the literature, categorizing the types of features used as input in each model into five categories: i) those using nothing more than sequence features, which included our method and two others [14,42], ii) those using MPRAs to measure promoter activity [23,43-45], iii) those utilizing the binding signal of transcription factors (TFs) at promoter regions, as measured by ChIP [11,13-15], iv) those utilizing the signal of histone marks such as H3K4me1, H3K4me3, H3K9me3, H3K27Ac, H3K27me3, and H3K36me3 at promoters and gene bodies, as measured by ChIP [11,12,14,16,17,42], and v) those utilizing DNase hypersensitivity signal at promoters and nearby enhancers [12,14,16,46] (Figure 4E).…”
Section: Performance Of Cell Type-specific Xpresso Models | Citation type: mentioning
confidence: 99%