Summary: Structural variations (SVs) are large genomic rearrangements that vary significantly in size, making them challenging to detect with the relatively short reads from next-generation sequencing (NGS). Different SV detection methods have been developed; however, each is limited to specific kinds of SVs with varying accuracy and resolution. Previous works have attempted to combine different methods, but they still suffer from poor accuracy, particularly for insertions. We propose MetaSV, an integrated SV caller which leverages multiple orthogonal SV signals for high accuracy and resolution. MetaSV proceeds by merging SVs from multiple tools for all types of SVs. It also analyzes soft-clipped reads from alignment to detect insertions accurately since existing tools underestimate insertion SVs. Local assembly in combination with dynamic programming is used to improve breakpoint resolution. Paired-end and coverage information is used to predict SV genotypes. Using simulation and experimental data, we demonstrate the effectiveness of MetaSV across various SV types and sizes. Availability and implementation: Code in Python is at http://bioinform.github.io/metasv/. Contact: rd@bina.com. Supplementary information: Supplementary data are available at Bioinformatics online.
SUMMARY Although uridine-rich small nuclear RNAs (U-snRNAs) are essential for pre-mRNA splicing, little is known regarding their function in the regulation of alternative splicing or the biological consequences of their dysfunction in mammals. Here, we demonstrate that mutation of Rnu2–8, one of the mouse multicopy U2 snRNA genes, causes ataxia and neurodegeneration. Coincident with the observed pathology, the level of mutant U2 RNAs was highest in the cerebellum and increased after granule neuron maturation. Furthermore, neuron loss was strongly dependent on the dosage of mutant and wild-type snRNA genes. Comprehensive transcriptome analysis identified a group of alternative splicing events, including the splicing of small introns, which were disrupted in the mutant cerebellum. Our results suggest that the expression of mammalian U2 snRNA genes, previously presumed to be ubiquitous, is spatially and temporally regulated, and dysfunction of a single U2 snRNA causes neuron degeneration through distortion of pre-mRNA splicing.
SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0758-2) contains supplementary material, which is available to authorized users.
Summary: VarSim is a framework for assessing alignment and variant calling accuracy in high-throughput genome sequencing through simulation or real data. In contrast to simulating a random mutation spectrum, it synthesizes diploid genomes with germline and somatic mutations based on a realistic model. This model leverages information such as previously reported mutations to make the synthetic genomes biologically relevant. VarSim simulates and validates a wide range of variants, including single nucleotide variants, small indels and large structural variants. It is an automated, comprehensive compute framework supporting parallel computation and multiple read simulators. Furthermore, we developed a novel map data structure to validate read alignments, a strategy to compare variants binned in size ranges and a lightweight, interactive, graphical report to visualize validation results with detailed statistics. Thus far, it is the most comprehensive validation tool for secondary analysis in next-generation sequencing. Availability and implementation: Code in Java and Python along with instructions to download the reads and variants is at http://bioinform.github.io/varsim. Contact: rd@bina.com. Supplementary information: Supplementary data are available at Bioinformatics online.
whwong@stanford.edu.