RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping. We highlight the challenges associated with each step. We discuss the analysis of small RNAs and the integration of RNA-seq with other functional genomics techniques. Finally, we discuss the outlook for novel technologies that are changing the state of the art in transcriptomics. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-0881-8) contains supplementary material, which is available to authorized users.
Primary triple negative breast cancers (TNBC) represent approximately 16% of all breast cancers [1] and are a tumour type defined by exclusion, for which comprehensive landscapes of somatic mutation have not been determined. Here we show, in 104 early TNBC cases, that at the time of diagnosis these cancers exhibit a wide and continuous spectrum of genomic evolution, with some exhibiting only a handful of somatic aberrations in a few pathways, whereas others contain hundreds of somatic events with multiple pathways implicated. Integration with matched whole transcriptome sequence data revealed that only ~36% of mutations are expressed. By examining single nucleotide variant (SNV) allelic abundance derived from deep re-sequencing (median >20,000-fold) measurements of 2,414 somatic mutations, we determine, for the first time in an epithelial tumour, the relative abundance of clonal genotypes among cases in the population. We show that TNBC vary widely and continuously in their clonal frequencies at the time of diagnosis, with basal subtype TNBC [2,3] exhibiting more variation than non-basal TNBC. Although p53 and PIK3CA/PTEN somatic mutations appear clonally dominant compared with other pathways, in some tumours their clonal frequencies are incompatible with founder status. Mutations in cytoskeletal and cell shape/motility proteins occurred at lower clonal frequencies, suggesting they occurred later during tumour progression. Taken together, our results show that future attempts to dissect the biology and therapeutic responses of TNBC will require the determination of individual tumour clonal genotypes.
BACKGROUND: Ovarian clear-cell and endometrioid carcinomas may arise from endometriosis, but the molecular events involved in this transformation have not been described.