BackgroundLung adenocarcinoma is the most common type of lung cancers. Whole-genome sequencing studies disclosed the genomic landscape of lung adenocarcinomas. however, it remains unclear if the genetic alternations could guide prognosis prediction. Effective genetic markers and their based prediction models are also at a lack for prognosis evaluation.MethodsWe obtained the somatic mutation data and clinical data for 371 lung adenocarcinoma cases from The Cancer Genome Atlas. The cases were classified into two prognostic groups (3-year survival), and a comparison was performed between the groups for the somatic mutation frequencies of genes, followed by development of computational models to discrete the different prognosis.ResultsGenes were found with higher mutation rates in good (≥ 3-year survival) than in poor (< 3-year survival) prognosis group of lung adenocarcinoma patients. Genes participating in cell-cell adhesion and motility were significantly enriched in the top gene list with mutation rate difference between the good and poor prognosis group. Support Vector Machine models with the gene somatic mutation features could well predict prognosis, and the performance improved as feature size increased. An 85-gene model reached an average cross-validated accuracy of 81% and an Area Under the Curve (AUC) of 0.896 for the Receiver Operating Characteristic (ROC) curves. The model also exhibited good inter-stage prognosis prediction performance, with an average AUC of 0.846 for the ROC curves.ConclusionThe prognosis of lung adenocarcinomas is related with somatic gene mutations. The genetic markers could be used for prognosis prediction and furthermore provide guidance for personal medicine.Electronic supplementary materialThe online version of this article (10.1186/s12885-019-5433-7) contains supplementary material, which is available to authorized users.
Human genes form a large variety of isoforms after transcription, encoding distinct transcripts to exert different functions. Single-molecule RNA sequencing facilitates accurate identification of the isoforms by extending nucleotide read length significantly. However, the gene or isoform diversity is lowly represented by the mRNA molecules captured by singlemolecule RNA sequencing. Here, we show that a cDNA normalization procedure before the library preparation for PacBio RS II sequencing captures 3.2-6.0 fold more full-length highquality isoform species for different human samples, as compared to the non-normalized capture procedure. Many lowly expressed, functionally important isoforms can be detected. In addition, normalized PacBio RNA sequencing also resolves more allele-specific haplotype transcripts. Finally, we apply the cDNA normalization based long-read RNA sequencing method to profile the transcriptome of human gastric signet-ring cell carcinomas, identify new cancer-specific transcriptome signatures, and thus, bring out the utility of the improved protocols in gene expression studies.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.