Massively parallel RNA sequencing (RNA-seq) has yielded a wealth of new insights into transcriptional regulation. A first step in the analysis of RNA-seq data is the alignment of short sequence reads to a common reference genome or transcriptome. Genetic variants that distinguish individual genomes from the reference sequence can cause reads to be misaligned, resulting in biased estimates of transcript abundance. Fine-tuning of read alignment algorithms does not correct this problem. We have developed Seqnature software to construct individualized diploid genomes and transcriptomes for multiparent populations and have implemented a complete analysis pipeline that incorporates other existing software tools. We demonstrate in simulated and real data sets that alignment to individualized transcriptomes increases read mapping accuracy, improves estimation of transcript abundance, and enables the direct estimation of allele-specific expression. Moreover, when applied to expression QTL mapping we find that our individualized alignment strategy corrects false-positive linkage signals and unmasks hidden associations. We recommend the use of individualized diploid genomes over reference sequence alignment for all applications of high-throughput sequencing technology in genetically diverse populations.
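The idea of an individualized diploid sequence can be illustrated with a minimal sketch. This is not Seqnature's implementation; the function name `individualize`, the 0-based SNP dictionary, and the toy sequence are assumptions made for illustration. Only single-nucleotide substitutions are handled here, and indels, which shift coordinates, are deliberately left out.

```python
from typing import Dict, Tuple


def individualize(reference: str, snps: Dict[int, Tuple[str, str]]) -> Tuple[str, str]:
    """Build two haplotype sequences by substituting SNP alleles into a reference.

    `snps` maps a 0-based position to (allele on haplotype A, allele on haplotype B).
    Hypothetical helper for illustration only; indels are not supported.
    """
    hap_a = list(reference)
    hap_b = list(reference)
    for pos, (allele_a, allele_b) in snps.items():
        if not 0 <= pos < len(reference):
            raise ValueError(f"SNP position {pos} is outside the reference")
        hap_a[pos] = allele_a
        hap_b[pos] = allele_b
    return "".join(hap_a), "".join(hap_b)


# Toy example: one heterozygous site (position 2) and one homozygous
# non-reference site (position 5).
reference = "ACGTACGT"
snps = {2: ("G", "T"), 5: ("A", "A")}
hap_a, hap_b = individualize(reference, snps)
print(hap_a)
print(hap_b)
```

Reads that carry the non-reference allele at either site match one of the two haplotypes exactly, which is the property that alignment to an individualized sequence relies on.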
RNA editing is a process that modifies RNA nucleotides and changes the efficiency and fidelity of the central dogma. Enzymes that catalyze RNA editing are required for life, and defects in RNA editing are associated with many diseases. Recent advances in sequencing have enabled the genome-wide identification of RNA editing sites in mammalian transcriptomes. Here, we demonstrate that canonical RNA editing (A-to-I and C-to-U) occurs in liver, white adipose, and bone tissues of the laboratory mouse, and we show that apparent non-canonical editing (all other possible base substitutions) is an artifact of current high-throughput sequencing technology. Further, we report that high-confidence canonical RNA editing sites can cause non-synonymous amino acid changes and are significantly enriched in 3′ UTRs, specifically at microRNA target sites, suggesting both regulatory and functional consequences for RNA editing.
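The distinction between canonical and non-canonical editing can be sketched as a simple classification of observed mismatches. The sketch assumes that mismatches are reported relative to the transcribed strand and uses the fact that inosine is read as G and uracil as T in cDNA sequencing; the function name `classify_substitution` is hypothetical, and no filtering for sequencing error or genomic variants is attempted.

```python
from itertools import permutations

# Canonical edits as they appear in sequenced cDNA (assumption: mismatches are
# reported on the transcribed strand). A-to-I reads as A>G; C-to-U reads as C>T.
CANONICAL = {("A", "G"): "A-to-I", ("C", "T"): "C-to-U"}


def classify_substitution(ref: str, obs: str) -> str:
    """Label a reference/observed base pair as a canonical edit or not."""
    ref, obs = ref.upper(), obs.upper()
    if ref == obs:
        raise ValueError("reference and observed bases are identical")
    return CANONICAL.get((ref, obs), "non-canonical")


# Enumerate all 12 possible single-base substitutions and count each class.
labels = [classify_substitution(r, o) for r, o in permutations("ACGT", 2)]
n_noncanonical = labels.count("non-canonical")
print(n_noncanonical)
```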
Background: The continued development of targeted therapeutics for cancer treatment has required the concomitant development of more expansive methods for the molecular profiling of the patient’s tumor. We describe the validation of the JAX Cancer Treatment Profile™ (JAX-CTP™), a next-generation sequencing (NGS)-based molecular diagnostic assay that detects actionable mutations in solid tumors to inform the selection of targeted therapeutics for cancer treatment. Methods: NGS libraries are generated from DNA extracted from formalin-fixed, paraffin-embedded tumors. Using hybrid capture, the genes of interest are enriched and sequenced on the Illumina HiSeq 2500 or MiSeq sequencers, followed by variant detection and functional and clinical annotation for the generation of a clinical report. Results: The JAX-CTP™ detects actionable variants, in the form of single nucleotide variations and small insertions and deletions (≤50 bp), in 190 genes in specimens with a neoplastic cell content of ≥10%. The JAX-CTP™ is also validated for the detection of clinically actionable gene amplifications. Conclusions: There is a lack of consensus in the molecular diagnostics field on the best method for validating NGS-based assays in oncology, which makes it important to communicate validation methods such as those contained in this report. The growing number of targeted therapeutics and the complexity of the tumor genome necessitate continued development and refinement of advanced assays for tumor profiling to enable precision cancer treatment.
Self-renewal, the ability of a stem cell to divide repeatedly while maintaining an undifferentiated state, is a defining characteristic of all stem cells. Here, we clarify the molecular foundations of mouse embryonic stem cell (mESC) self-renewal by applying a proven Bayesian network machine learning approach to integrate high-throughput data for protein function discovery. By focusing on a single stem-cell system, at a specific developmental stage, within the context of well-defined biological processes known to be active in that cell type, we produce a consensus predictive network that reflects biological reality more closely than those made by prior efforts using more generalized, context-independent methods. In addition, we show how machine learning efforts may be misled if the tissue-specific role of mammalian proteins is not defined in the training set and circumscribed in the evidential data. For this study, we assembled an extensive compendium of mESC data: ∼2.2 million data points, collected from 60 different studies, under 992 conditions. We then integrated these data into a consensus mESC functional relationship network focused on biological processes associated with embryonic stem cell self-renewal and cell fate determination. Computational evaluations, literature validation, and analyses of predicted functional linkages show that our results are highly accurate and biologically relevant. Our mESC network predicts many novel players involved in self-renewal and serves as the foundation for future pluripotent stem cell studies. Stem cell researchers can use this network (at http://StemSight.org) to explore hypotheses about gene function in the context of self-renewal and to prioritize genes of interest for experimental validation.
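How Bayesian integration combines evidence from several data sets can be sketched with a naive Bayes calculation, which is one simple form of Bayesian network. The abstract does not specify the network structure, so this is an assumption; the function name `posterior_related`, the prior, and the likelihood values are invented for illustration and are not taken from the study.

```python
from typing import Iterable, Tuple


def posterior_related(prior: float, evidence: Iterable[Tuple[float, float]]) -> float:
    """Posterior probability that two genes are functionally related.

    `evidence` holds one pair per data set:
    (P(observation | related), P(observation | unrelated)).
    Data sets are treated as conditionally independent (naive Bayes assumption).
    """
    p_related = prior
    p_unrelated = 1.0 - prior
    for like_related, like_unrelated in evidence:
        p_related *= like_related
        p_unrelated *= like_unrelated
    return p_related / (p_related + p_unrelated)


# Hypothetical numbers: a low prior, then two data sets whose observations are
# each more likely under "related" than under "unrelated".
one_dataset = posterior_related(0.5, [(0.8, 0.2)])
two_datasets = posterior_related(0.1, [(0.8, 0.2), (0.6, 0.3)])
print(one_dataset, two_datasets)
```

Each supporting data set multiplies the odds of a functional relationship, so consistent evidence across many studies can raise a low prior substantially.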
Embryonic stem cells (ESCs), characterized by their ability to both self-renew and differentiate into multiple cell lineages, are a powerful model for biomedical research and developmental biology. Human and mouse ESCs share many features, yet have distinctive aspects, including fundamental differences in the signaling pathways and cell cycle controls that support self-renewal. Here, we explore the molecular basis of human ESC self-renewal using Bayesian network machine learning to integrate cell-type-specific, high-throughput data for gene function discovery. We integrated high-throughput ESC data from 83 human studies (~1.8 million data points collected under 1100 conditions) and 62 mouse studies (~2.4 million data points collected under 1085 conditions) into separate human and mouse predictive networks focused on ESC self-renewal to analyze shared and distinct functional relationships among protein-coding gene orthologs. Computational evaluations show that these networks are highly accurate, literature validation confirms their biological relevance, and RT-PCR validation supports our predictions. Our results reflect the importance of key regulatory genes known to be strongly associated with self-renewal and pluripotency in both species (e.g. POU5F1, SOX2, and NANOG), identify metabolic differences between species (e.g. threonine metabolism), clarify differences between human and mouse ESC developmental signaling pathways (e.g. LIF-activated JAK/STAT in mouse; NODAL/ACTIVIN-A-activated FGF in human), and reveal many novel genes and pathways predicted to be functionally associated with self-renewal in each species. These interactive networks are available online at www.StemSight.org for stem cell researchers to develop new hypotheses, discover potential mechanisms involving sparsely annotated genes, and prioritize genes of interest for experimental validation.