2019
DOI: 10.1186/s13059-019-1653-z

MMSplice: modular modeling improves the predictions of genetic variant effects on splicing

Abstract: Predicting the effects of genetic variants on splicing is highly relevant for human genetics. We describe the framework MMSplice (modular modeling of splicing) with which we built the winning model of the CAGI5 exon skipping prediction challenge. The MMSplice modules are neural networks scoring exon, intron, and splice sites, trained on distinct large-scale genomics datasets. These modules are combined to predict effects of variants on exon skipping, splice site choice, splicing efficiency, and pathogenicity, …

Citations: cited by 186 publications (226 citation statements)
References: 57 publications
“…The group 3 (ranked 5th) did not provide implementation details. On the other side, the group 5 made the best predictions by using their developed MMSplice method (Cheng et al, ). In their method, six deep neural networks have been trained to extract features of splice donor, splice acceptor, 5′ exon, 3′ exon, 5′ intron, and 3′ intron, which were later combined by a simple linear regression to predict ΔΨ.…”
Section: Discussion (mentioning)
confidence: 99%
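
The combination step described in this statement (per-region module scores merged by a simple linear regression to predict ΔΨ) can be illustrated with a minimal sketch. This is not the authors' implementation: the module names, the synthetic scores, and the weights below are all assumptions made up for illustration, and ordinary least squares via NumPy stands in for whatever fitting procedure was actually used.

```python
import numpy as np

# Hypothetical module names, following the six regions listed in the quote.
MODULES = ["donor", "acceptor", "exon_5p", "exon_3p", "intron_5p", "intron_3p"]


def fit_linear_combiner(score_deltas, delta_psi):
    """Least-squares fit of delta_psi ~ intercept + score_deltas @ weights.

    score_deltas: (n_variants, n_modules) array of per-module score changes
                  (alternative minus reference allele); assumed precomputed.
    delta_psi:    (n_variants,) array of measured delta-PSI values.
    """
    design = np.column_stack([np.ones(len(score_deltas)), score_deltas])
    coef, *_ = np.linalg.lstsq(design, delta_psi, rcond=None)
    return coef[0], coef[1:]


def predict_delta_psi(score_deltas, intercept, weights):
    """Apply the fitted linear combination to new module score changes."""
    return intercept + np.asarray(score_deltas) @ weights


# Synthetic, noise-free data purely for demonstration (not real measurements).
rng = np.random.default_rng(0)
true_weights = np.array([0.30, 0.25, 0.10, 0.10, 0.05, 0.05])
scores = rng.normal(size=(200, len(MODULES)))
measured = scores @ true_weights + 0.02

intercept, weights = fit_linear_combiner(scores, measured)
predicted = predict_delta_psi(scores[:3], intercept, weights)
```

Because the synthetic targets are an exact linear function of the scores, the fit recovers the illustrative weights; with real data the quality of the fit would depend on how informative each module's score is.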
“…It is probably difficult to train a model capturing much of the splicing regulatory elements directly from these data. Therefore, we used complementary data from different sources that are richer (Cheng et al, ). We used the GENCODE 24 annotation to train a module to score donor sites and similarly a module to score acceptor sites.…”
Section: Methods (mentioning)
confidence: 99%
“…Although the two challenges have different measured quantities, we assumed that variant disrupting splicing could affect both Ψ and splicing efficiency. Therefore, we applied a modular modeling approach, MMSplice (Cheng et al, ), where the modules score different gene regions and are shared across challenges. The predictors proposed for each challenge differed only in how they combine the scores of the individual modules.…”
Section: Introduction (mentioning)
confidence: 99%
“…While other aspects of their model could account for this superior performance, one unique feature of their model (MMsplice; Cheng et al, 2019) is the decomposition of sequence surrounding alternatively spliced exons into five distinct regions (upstream intron, acceptor site, exon, donor site, and downstream intron), each of which was evaluated by a distinct neural network.…”
mentioning
confidence: 99%
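
The five-region decomposition mentioned in this statement can be sketched as a simple slicing routine. The function name, the coordinate convention, and the splice-site window lengths are hypothetical choices for illustration only; they are not taken from the paper, and the regions here are kept non-overlapping purely for simplicity.

```python
def split_regions(seq, exon_start, exon_end,
                  acc_intron=4, acc_exon=2, don_exon=2, don_intron=4):
    """Split a sequence into five consecutive, non-overlapping regions.

    seq[exon_start:exon_end] is the exon (0-based, half-open coordinates).
    The window lengths around each splice site are arbitrary defaults
    chosen for this example.
    """
    if not (acc_intron <= exon_start < exon_end <= len(seq) - don_intron):
        raise ValueError("exon coordinates leave no room for splice-site windows")
    if exon_end - exon_start < acc_exon + don_exon:
        raise ValueError("exon is shorter than the exonic splice-site windows")
    return {
        "upstream_intron": seq[:exon_start - acc_intron],
        "acceptor": seq[exon_start - acc_intron:exon_start + acc_exon],
        "exon": seq[exon_start + acc_exon:exon_end - don_exon],
        "donor": seq[exon_end - don_exon:exon_end + don_intron],
        "downstream_intron": seq[exon_end + don_intron:],
    }


# Toy sequence: lowercase intron, uppercase exon (made up for illustration).
seq = "ttttttttag" + "GATTACAGATTACA" + "gtaagttttt"
regions = split_regions(seq, exon_start=10, exon_end=24)
for name, subseq in regions.items():
    print(f"{name:>18}: {subseq}")
```

In a modular setup along the lines the statement describes, each of these substrings would be passed to its own scoring model; here the routine only shows how one input sequence maps onto the five named regions.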