Predicting the effects of genetic variants on splicing is highly relevant for human genetics. We describe the framework MMSplice (modular modeling of splicing), with which we built the winning model of the CAGI 2018 exon skipping prediction challenge. The MMSplice modules are neural networks that score exons, introns, and splice sites, trained on distinct large-scale genomics datasets. These modules are combined to predict the effects of variants on exon skipping, alternative donor and acceptor sites, splicing efficiency, and pathogenicity, with performance matching or exceeding the state of the art. Our models, available in the Kipoi repository, apply to variants, including indels, directly from VCF files.
Recently, de novo peptide sequencing has made it possible to gain new insights into the human immunopeptidome without relying on peptide databases, including the identification of peptides of unknown origin. Many recent studies have attributed those peptides to post-translational proteasomal splicing. Here, we describe a peptide source assignment workflow that rigorously assigns the source of de novo sequenced peptides, and we find that the estimated extent of post-translational splicing in the immunopeptidome is much lower than previously reported. Our approach demonstrates that many peptides thought to be post-translationally spliced are likely linear peptides, and that many peptides thought to be trans-spliced could be cis-spliced. We believe our approach furthers the understanding of post-translationally spliced peptides and thus improves the characterization of the immunopeptidome, which plays a critical role in the immune response to antigens in cancer, autoimmune disease, and infections.
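The idea of assigning each peptide to its most parsimonious source can be illustrated with a minimal sketch. This is not the authors' workflow: the function name `classify_peptide`, the toy proteome, and the priority order (linear, then cis-spliced, then trans-spliced) are assumptions made for illustration only. A real pipeline would also need to handle sequencing ambiguities such as isobaric residues, restrict fragment lengths and distances, and search a full reference proteome.

```python
from typing import Dict, Optional, Tuple


def classify_peptide(
    peptide: str, proteome: Dict[str, str]
) -> Tuple[str, Optional[Tuple[str, ...]]]:
    """Assign a peptide to the simplest explanation found (hypothetical helper).

    Categories are tried in order of increasing complexity, so a peptide is
    only called spliced when no linear explanation exists, and only called
    trans-spliced when no single protein can supply both fragments.
    """
    # 1. Linear: the peptide occurs contiguously in some protein.
    for name, seq in proteome.items():
        if peptide in seq:
            return "linear", (name,)

    # 2. Cis-spliced: both fragments come from the same protein.
    #    Assumption: any fragment order and any distance is allowed here.
    for name, seq in proteome.items():
        for cut in range(1, len(peptide)):
            if peptide[:cut] in seq and peptide[cut:] in seq:
                return "cis-spliced", (name,)

    # 3. Trans-spliced: the two fragments come from different proteins.
    for cut in range(1, len(peptide)):
        left, right = peptide[:cut], peptide[cut:]
        left_sources = [n for n, s in proteome.items() if left in s]
        right_sources = [n for n, s in proteome.items() if right in s]
        if left_sources and right_sources:
            return "trans-spliced", (left_sources[0], right_sources[0])

    return "unassigned", None


# Toy proteome with made-up sequences, for illustration only.
toy_proteome = {
    "P1": "MKTAYIAKQRQISFVKSHFSRQ",
    "P2": "GAVLIPFMWSTCYNQDEKRH",
}

linear_call = classify_peptide("AYIAKQ", toy_proteome)
cis_call = classify_peptide("MKTASFVK", toy_proteome)
trans_call = classify_peptide("MKTAGAVL", toy_proteome)
unknown_call = classify_peptide("XXXX", toy_proteome)
```

Because this sketch allows fragments as short as one residue, almost any sequence can be explained as spliced against a large proteome. That is one reason the ordering matters: linear and cis explanations are exhausted before a trans-spliced call is made.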