We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the United States Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for these platforms as well as for qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10 Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.
Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement are hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which showed significantly improved prediction performance while returning results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.
Fuchs endothelial corneal dystrophy (FECD) is a common, familial disease of the corneal endothelium and is the leading indication for corneal transplantation. Variation in the transcription factor 4 (TCF4) gene has been identified as a major contributor to the disease. We tested for an association between an intronic TGC trinucleotide repeat in TCF4 and FECD by determining repeat length in 66 affected participants with severe FECD and 63 participants with normal corneas in a 3-stage discovery/replication/validation study. PCR primers flanking the TGC repeat were used to amplify leukocyte-derived genomic DNA. Repeat length was determined by direct sequencing, short tandem repeat (STR) assay and Southern blotting. Genomic Southern blots were used to evaluate samples for which only a single allele was identified by STR analysis. Compiling data for the 3 arms of the study, a TGC repeat length >50 was present in 79% of FECD cases and in 3% of normal controls (p<0.001). Among cases, 52 of 66 (79%) subjects had >50 TGC repeats, 13 (20%) had <40 repeats and 1 (2%) had an intermediate repeat length. In comparison, only 2 of 63 (3%) unaffected control subjects had >50 repeats, 60 (95%) had <40 repeats and 1 (2%) had an intermediate repeat length. The repeat length was greater than 1000 in 4 FECD cases. The sensitivity and specificity of >50 TGC repeats for identifying FECD in this patient cohort were 79% and 96%, respectively. The expanded TGC repeat was more specific for FECD cases than the previously identified, highly associated single nucleotide polymorphism rs613872 (specificity = 79%). The TGC trinucleotide repeat expansion in TCF4 is strongly associated with FECD, and a repeat length >50 is highly specific for the disease. This association suggests that trinucleotide expansion may play a pathogenic role in the majority of FECD cases and is a predictor of disease risk.
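The sensitivity and specificity quoted in the FECD abstract can be checked against the reported counts. The following is a minimal sketch, not the authors' analysis: the function name and the mapping of counts to true/false positives and negatives are assumptions made here for illustration, treating ">50 TGC repeats" as a positive test. The computed specificity is about 96.8%, so the reported 96% appears to be rounded down.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of cases that test positive
    specificity = tn / (tn + fp)  # fraction of controls that test negative
    return sensitivity, specificity


# Counts taken from the abstract (assumed mapping):
# 52 of 66 FECD cases and 2 of 63 controls had >50 TGC repeats.
cases_total, cases_positive = 66, 52
controls_total, controls_positive = 63, 2

sens, spec = sensitivity_specificity(
    tp=cases_positive,
    fn=cases_total - cases_positive,
    tn=controls_total - controls_positive,
    fp=controls_positive,
)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints: sensitivity = 78.8%, specificity = 96.8%
```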
Thiopurine methyltransferase (TPMT) catalyzes the S-methylation of thiopurine drugs. Individual variation in the toxicity and therapeutic efficacy of these drugs is associated with a common genetic polymorphism that controls levels of TPMT activity and immunoreactive protein in human tissues. Because of the clinical significance of the "pharmacogenetic" regulation of this enzyme, it would be important to clone the gene for TPMT in humans and to study the molecular basis for the genetic polymorphism. As a first step toward cloning the gene for TPMT, we used the rapid amplification of genomic DNA ends to obtain a TPMT-specific intron sequence. That DNA sequence was used to design primers for the polymerase chain reaction (PCR), which made it possible to determine that the active gene for TPMT is located on human chromosome 6. A TPMT-positive cosmid clone was then isolated from a human chromosome 6-specific genomic DNA library, and the gene was sublocalized to chromosome band 6p22.3 by fluorescence in situ hybridization. The gene for TPMT was found to be approximately 34 kb in length and consisted of 10 exons and 9 introns. On the basis of the results of 5'-rapid amplification of cDNA ends, transcription initiation occurred at or near a point 89 nucleotides upstream from the translation initiation codon of previously reported TPMT cDNAs. Once the structure of the TPMT gene had been determined, it was possible to perform the PCR with primers complementary to the sequences of introns flanking each exon that encodes enzyme protein with template DNA obtained from subjects with known phenotypes for the TPMT genetic polymorphism. This DNA was isolated from blood samples from 4 unrelated subjects with genetically low TPMT activity and 4 unrelated subjects with high TPMT activity. All subjects with low TPMT activity were homozygous for two point mutations--a G-->A transition at nucleotide 460 in exon 7 and an A-->G transition at nucleotide 719 in exon 10. 
Both mutations resulted in alterations in amino acid sequence, with Ala-154-->Thr and Tyr-240-->Cys, respectively. All DNA samples isolated from the blood of subjects with high TPMT activity contained "wild-type" sequence. Results obtained with these blood samples were confirmed when DNA from four human liver samples with high TPMT activity was found to have wild-type sequence at nucleotides 460 and 719, while three liver samples with intermediate enzyme activity (i.e., samples presumed to be heterozygous for the polymorphism) were heterozygous for the exon 7 and exon 10 mutations present in the blood samples of homozygous low subjects. Transient expression in COS-1 cells of TPMT expression constructs that contained both of the mutations in exons 7 and 10, as well as each independently, demonstrated that each mutation, as well as both together, resulted in decreased expression of TPMT enzymatic activity and immunoreactive protein. Molecular cloning and structural characterization of the TPMT gene as well as elucidation of the molecular basis for a common TPMT genetic poly...
Small intestine neuroendocrine tumors (SI-NETs) are the most common malignancy of the small bowel. Several clinical trials target PI3K/Akt/mTOR signaling; however, it is unknown whether these or other genes are genetically altered in these tumors. To address the underlying genetics, we analyzed 48 SI-NETs by massively parallel exome sequencing. We detected an average of 0.1 somatic single nucleotide variants (SNVs) per 10^6 nucleotides (range, 0-0.59), mostly transitions (C>T and A>G), which suggests that SI-NETs are stable cancers. 197 protein-altering somatic SNVs affected a preponderance of cancer genes, including FGFR2, MEN1, HOOK3, EZH2, MLF1, CARD11, VHL, NONO, and SMAD1. Integrative analysis of SNVs and somatic copy number variations identified recurrently altered mechanisms of carcinogenesis: chromatin remodeling, DNA damage, apoptosis, RAS signaling, and axon guidance. Candidate therapeutically relevant alterations were found in 35 patients, including SRC, SMAD family genes, AURKA, EGFR, HSP90, and PDGFR. Mutually exclusive amplification of AKT1 or AKT2 was the most common event in the 16 patients with alterations of PI3K/Akt/mTOR signaling. We conclude that sequencing-based analysis may provide provisional grouping of SI-NETs by therapeutic targets or deregulated pathways.
Introduction
Small intestine neuroendocrine neoplasms (SI-NENs) are the most common malignancy of the small bowel, represent the largest group of NENs by organ site, and are studied in clinical treatment trials targeting PI3K/Akt/mTOR signaling.
Whether this or other canonical cancer pathways are recurrently mutated, however, is uncertain, because a genome-wide, unbiased sequence analysis of cancer genes has not been performed to date in SI-NENs. Massively parallel, or "nextgen," DNA sequencing is currently advancing research in other human malignancies by facilitating the collection of comprehensive, genome-wide, unbiased datasets that provide a common data framework for comparing results across different tumor types and gene sets. It provides the most comprehensive technology to date to explore the potential of genomics for individualizing cancer treatment within a tumor type. To unlock and explore the potential of the technology for translational research in SI-NEN, we sequenced 48 such tumors.