Motivation: Current fusion detection tools use diverse calling approaches and provide varying results, making selection of the appropriate tool challenging. Ensemble fusion calling techniques appear promising; however, current options have limited accessibility and function. Results: MetaFusion is a flexible metacalling tool that amalgamates outputs from any number of fusion callers. Individual caller results are standardized by conversion into the new file type Common Fusion Format. Calls are annotated, merged using graph clustering, filtered and ranked to provide a final output of high-confidence candidates. MetaFusion consistently achieves higher precision and recall than individual callers on real and simulated datasets, and reaches up to 100% precision, indicating that ensemble calling is imperative for high-confidence results. MetaFusion uses FusionAnnotator to annotate calls with information from cancer fusion databases and is provided with a Benchmarking Toolkit to calibrate new callers. Availability and implementation: MetaFusion is freely available at https://github.com/ccmbioinfo/MetaFusion. Supplementary information: Supplementary data are available at Bioinformatics online.
Motivation: Gene fusions are often associated with cancer, yet current fusion detection tools vary in their calling approaches, making selecting the right tool challenging. Ensemble fusion calling techniques appear promising; however, current options have limited accessibility and function. Results: MetaFusion is a flexible meta-calling tool that amalgamates the outputs from any number of fusion callers. Results from individual callers are converted into Common Fusion Format, a new file type that standardizes outputs from callers. Calls are then annotated, merged using graph clustering, filtered and ranked to provide a final output of high confidence candidates. MetaFusion consistently outperformed individual callers with respect to recall and precision on real and simulated datasets, achieving up to 100% precision. Thus, an ensemble calling approach is imperative for high confidence results. MetaFusion also labels fusions found in databases using the FusionAnnotator package, and is provided with a benchmarking toolkit to calibrate new callers. Availability: MetaFusion is freely available at https://github.com/ccmbioinfo/MetaFusion. Contact: arun.ramani@sickkids.ca
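The merge-by-graph-clustering step described above can be illustrated with a minimal sketch. It is not MetaFusion's implementation: the `Call` fields, the matching rule in `same_event` (identical gene pair, or both breakpoints within a `window` of base pairs) and the `min_callers` filter are all assumptions chosen for illustration, and the records are not in Common Fusion Format. Calls are treated as graph nodes, pairwise matches as edges, and each connected component becomes one merged candidate ranked by the number of supporting callers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One fusion call from one caller (hypothetical fields, not CFF)."""
    caller: str
    gene5: str
    gene3: str
    chrom5: str
    pos5: int
    chrom3: str
    pos3: int


def same_event(a, b, window):
    """Assumed edge rule: same gene pair, or both breakpoints close together."""
    same_genes = (a.gene5, a.gene3) == (b.gene5, b.gene3)
    near = (
        a.chrom5 == b.chrom5
        and a.chrom3 == b.chrom3
        and abs(a.pos5 - b.pos5) <= window
        and abs(a.pos3 - b.pos3) <= window
    )
    return same_genes or near


def merge_calls(calls, window=100, min_callers=2):
    """Cluster calls into connected components, then filter and rank them."""
    parent = list(range(len(calls)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge (union) for every pair of calls that match.
    for i in range(len(calls)):
        for j in range(i + 1, len(calls)):
            if same_event(calls[i], calls[j], window):
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, call in enumerate(calls):
        clusters[find(i)].append(call)

    merged = []
    for members in clusters.values():
        callers = sorted({m.caller for m in members})
        if len(callers) >= min_callers:  # filter: require multi-caller support
            merged.append({
                "genes": sorted({(m.gene5, m.gene3) for m in members}),
                "callers": callers,
                "n_callers": len(callers),
            })
    # Rank: most caller support first, ties broken by gene names.
    merged.sort(key=lambda m: (-m["n_callers"], m["genes"]))
    return merged


if __name__ == "__main__":
    # Placeholder calls; gene names and coordinates are invented.
    demo = [
        Call("callerA", "GENE_A", "GENE_B", "chr1", 1000, "chr2", 5000),
        Call("callerB", "GENE_A", "GENE_B", "chr1", 1003, "chr2", 5001),
        Call("callerC", "GENE_A_alias", "GENE_B", "chr1", 1010, "chr2", 5010),
        Call("callerA", "GENE_X", "GENE_Y", "chr3", 10, "chr4", 20),
    ]
    for candidate in merge_calls(demo):
        print(candidate)
```

In this sketch the third call uses a different gene label but is still merged because its breakpoints fall within the window, which is the kind of naming discrepancy between callers that clustering is meant to absorb. The single-caller call is dropped by the assumed `min_callers` filter.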
A key challenge in the application of whole-genome sequencing (WGS) for clinical diagnostics and research is the high-throughput prioritization of functional variants in the non-coding genome. This challenge is compounded by context-specific genetic modulation of gene expression: variant-gene mapping depends on the tissues and organ systems affected in a given disease. For instance, a disease affecting the gastrointestinal system would use maps specific to genome regulation in gut-related tissues. While there are large-scale atlases of genome regulation, such as GTEx and NIH Roadmap Epigenomics, the clinical genetics community lacks publicly available stand-alone software for high-throughput annotation of custom variant data with user-defined tissue-specific epigenetic maps and clinical genetic databases, to prioritize variants for a specific biomedical application. In this work, we provide a simple software pipeline, called SNPnotes, which takes as input variant calls for a patient and prioritizes them using information on clinical relevance from ClinVar, tissue-specific gene regulation from GTEx and disease associations from the NHGRI-EBI GWAS catalogue. This pipeline was developed as part of SVAI Research's "Undiagnosed-1" event for collaborative patient diagnosis. We applied this pipeline to WGS-based variant calls for an individual with a history of gastrointestinal symptoms, using 12 gut-specific eQTL maps and GWAS associations for metabolic diseases, for variant-gene mapping. Out of 6,248,584 SNPs, the pipeline identified 151 high-priority variants, overlapping 129 genes. These top SNPs all have known clinical pathogenicity, modulate gene expression in gut tissues and have genetic associations with metabolic disorders, and serve as starting points for hypotheses about mechanisms driving clinical symptoms. Simple software changes can be made to customize the pipeline for other tissue-specific applications.
Future extensions could integrate maps of tissue-specific regulatory elements, higher-order chromatin loops, and mutations affecting splice variants.
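The prioritization logic described above, in which top variants must have evidence from all three sources, can be sketched as a simple set intersection. This is an illustration only and not the SNPnotes code: the function name `prioritize`, the in-memory data structures and the variant and gene identifiers are hypothetical, and a real pipeline would load these lookups from ClinVar, GTEx eQTL and GWAS catalogue files.

```python
def prioritize(variant_ids, clinvar_pathogenic, tissue_eqtls, gwas_hits):
    """Return variants supported by all three evidence sources.

    variant_ids        -- iterable of variant identifiers from the patient's calls
    clinvar_pathogenic -- set of identifiers with known clinical pathogenicity
    tissue_eqtls       -- dict mapping identifier -> set of genes it regulates
                          in the tissues of interest (e.g. gut eQTL maps)
    gwas_hits          -- set of identifiers associated with the disease class
    """
    high_priority = {}
    for vid in variant_ids:
        genes = tissue_eqtls.get(vid, set())
        # Assumed rule: keep a variant only if every source supports it.
        if vid in clinvar_pathogenic and genes and vid in gwas_hits:
            high_priority[vid] = sorted(genes)
    return high_priority


if __name__ == "__main__":
    # Placeholder identifiers; none of these are real variants or genes.
    patient_variants = ["var1", "var2", "var3", "var4"]
    clinvar = {"var1", "var2", "var4"}
    eqtls = {"var1": {"GENE_A", "GENE_B"}, "var2": {"GENE_C"}, "var3": {"GENE_D"}}
    gwas = {"var1", "var3", "var4"}

    top = prioritize(patient_variants, clinvar, eqtls, gwas)
    genes = sorted({g for gs in top.values() for g in gs})
    print(top)    # variants passing all three filters, with their mapped genes
    print(genes)  # genes overlapped by the high-priority variants
```

Swapping in a different `tissue_eqtls` lookup is the kind of small change that would retarget such a filter to another tissue, in line with the customization the text mentions.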
Recent advancements in high-throughput sequencing analysis have enabled the characterization of cancer-driving fusions, improving our understanding of cancer development. Most fusion calling methods, however, examine either RNA or DNA information alone and are limited to a rigid definition of what constitutes a fusion. For this study we developed a pipeline that incorporates several fusion calling methods and considers both RNA and DNA to capture a more complete representation of the tumour fusion landscape. Interestingly, most of the fusions we identified were specific to RNA, with no evidence of corresponding genomic restructuring. Further, while the average total number of fusions in tumour and normal brain tissue samples is comparable, their overall fusion profiles vary significantly. Tumours have an over-representation of fusions occurring between coding genes, whereas fusions involving intergenic or non-coding regions comprised the vast majority of those in normals. Tumours also had more unique, sample-specific fusions than normals, though several fusions exhibited strong recurrence in the tumour type examined (diffuse intrinsic pontine glioma; DIPG) and were absent from both normal tissues and other cancers. Intriguingly, tumours also show broad up- or down-regulation of spliceosomal gene expression, which significantly correlates with fusion number (p=0.007). Our results show that RNA-specific fusions are abundant in both tumour and normal tissue and are associated with spliceosomal gene dysregulation. RNA-specific fusions should be considered as a potential mechanism that may contribute to cancer initiation and maintenance alongside more traditional structural events.