Lung cancer is the leading cause of cancer-related mortality worldwide, with non-small-cell lung carcinomas in smokers being the predominant form of the disease. Although previous studies have identified important common somatic mutations in lung cancers, they have primarily focused on a limited set of genes and have thus provided a constrained view of the mutational spectrum. Recent cancer sequencing efforts have used next-generation sequencing technologies to provide a genome-wide view of mutations in leukaemia, breast cancer and cancer cell lines. Here we present the complete sequences of a primary lung tumour (60x coverage) and adjacent normal tissue (46x). Comparing the two genomes, we identify a wide variety of somatic variations, including >50,000 high-confidence single nucleotide variants. We validated 530 somatic single nucleotide variants in this tumour, including one in the KRAS proto-oncogene and 391 others in coding regions, as well as 43 large-scale structural variations. These constitute a large set of new somatic mutations and yield an estimated 17.7 per megabase genome-wide somatic mutation rate. Notably, we observe a distinct pattern of selection against mutations within expressed genes compared to non-expressed genes and in promoter regions up to 5 kilobases upstream of all protein-coding genes. Furthermore, we observe a higher rate of amino acid-changing mutations in kinase genes. We present a comprehensive view of somatic alterations in a single lung tumour, and provide the first evidence, to our knowledge, of distinct selective pressures present within the tumour environment.
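The per-megabase rate quoted above is a ratio of the somatic variant count to the number of megabases surveyed. The sketch below shows only that arithmetic. The function name and both input values are illustrative assumptions, since the abstract does not state the exact numerator or the callable genome size behind the 17.7 figure.

```python
def mutations_per_megabase(variant_count: int, callable_bases: int) -> float:
    """Return somatic mutations per megabase (1 Mb = 1,000,000 bases)."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return variant_count / (callable_bases / 1_000_000)


# Hypothetical inputs for illustration only; they are NOT values reported by the study.
example_rate = mutations_per_megabase(variant_count=50_000,
                                      callable_bases=2_800_000_000)
print(f"{example_rate:.1f} mutations per Mb")
```

The denominator is the part most likely to vary between studies (whole genome versus only the bases with adequate coverage in both tumour and normal samples), so it is kept as an explicit parameter.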
Hepatitis B virus (HBV) infection is a leading risk factor for hepatocellular carcinoma (HCC). HBV integration into the host genome has been reported, but its scale, impact and contribution to HCC development are not clear. Here, we sequenced the tumor and nontumor genomes (>80× coverage) and transcriptomes of four HCC patients and identified 255 HBV integration sites. Increased sequencing to 240× coverage revealed a proportionally higher number of integration sites. Clonal expansion of HBV-integrated hepatocytes was found specifically in tumor samples. We observe a diverse collection of genomic perturbations near viral integration sites, including direct gene disruption, viral promoter-driven human transcription, viral-human transcript fusion, and DNA copy number alteration. Thus, we report the most comprehensive characterization of HBV integration in hepatocellular carcinoma patients. Such widespread random viral integration will likely increase carcinogenic opportunities in HBV-infected individuals. [Supplemental material is available for this article.] HBV integration into the host genome has been reported both in tumors (Gozuacik et al. 2001; Murakami et al. 2005; Saigo et al. 2008) and in nontumor liver tissue from HBV-infected individuals (Mason et al. 2010), although such integration is not essential for HBV replication. The relative extent, mutation model, and functional impact of HBV integration in host genomes are not clear due to the lack of an unbiased approach to identify and quantify genome-wide HBV integration sites. Recent advances in sequencing technologies (Meyerson et al. 2010) provide an opportunity to investigate the global extent, mutation model, and functional impact of viral integration in the host genome. Recently, a primary hepatitis C virus (HCV)-infected HCC patient was subjected to whole-genome sequencing, and many somatic mutations were reported (Totoki et al. 2011).
However, as an RNA virus, HCV never integrates into the host genome during its life cycle; therefore, liver cancer with HCV infection is not an optimal model to study viral-human genomic interactions. To that end, sequencing the genome and transcriptome of an HBV-positive HCC patient provides a great opportunity to reveal the functional impact of viral integration on the host genome.
Recent advances in whole genome sequencing have brought the vision of personal genomics and genomic medicine closer to reality. However, current methods lack clinical accuracy and the ability to describe the context (haplotypes) in which genome variants co-occur in a cost-effective manner. Here we describe a low-cost DNA sequencing and haplotyping process, Long Fragment Read (LFR) technology, similar to sequencing long single DNA molecules without cloning or separation of metaphase chromosomes. In this study, ten LFR libraries were made using only ~100 pg of human DNA per sample. Up to 97% of the heterozygous single nucleotide variants (SNVs) were assembled into long haplotype contigs. Removal of false positive SNVs not phased by multiple LFR haplotypes resulted in a final genome error rate of 1 in 10 Mb. Cost-effective and accurate genome sequencing and haplotyping from 10-20 human cells, as demonstrated here, will enable comprehensive genetic studies and diverse clinical applications.
Purpose: We investigated the frequencies and characteristics of intragenic copy-number variants (CNVs) in a deep sampling of disease genes associated with monogenic disorders. Methods: Subsets of 1507 genes were tested using next-generation sequencing to simultaneously detect sequence variants and CNVs in >143,000 individuals referred for genetic testing. We analyzed CNVs in gene panels for hereditary cancer syndromes and cardiovascular, neurological, or pediatric disorders. Results: Our analysis identified 2844 intragenic CNVs in 384 clinically tested genes. CNVs were observed in 1.9% of the entire cohort but in a disproportionately high fraction (9.8%) of individuals with a clinically significant result. CNVs accounted for 4.7–35% of pathogenic variants, depending on clinical specialty. Distinct patterns existed among CNVs in terms of copy number, location, exons affected, clinical classification, and genes affected. Separately, analysis of de-identified data for 599 genes unrelated to the clinical phenotype yielded 4054 CNVs. Most of these CNVs were novel rare events, present as duplications, and enriched in genes associated with recessive disorders or lacking loss-of-function mutational mechanisms. Conclusion: Universal intragenic CNV analysis adds substantial clinical sensitivity to genetic testing. Clinically relevant CNVs have distinct properties that distinguish them from CNVs contributing to normal variation in human disease genes.
Common single-nucleotide polymorphisms (SNPs) at nicotinic acetylcholine receptor (nAChR) subunit genes have previously been associated with measures of nicotine dependence. We investigated the contribution of common SNPs and rare single-nucleotide variants (SNVs) in nAChR genes to Fagerström test for nicotine dependence (FTND) scores in treatment-seeking smokers. Exons of 10 genes were resequenced with next-generation sequencing technology in 448 European-American participants of a smoking cessation trial, and CHRNB2 and CHRNA4 were resequenced by Sanger technology to improve sequence coverage. A total of 214 SNP/SNVs were identified, of which 19.2% were excluded from analyses because of reduced completion rate, 73.9% had minor allele frequencies <5%, and 48.1% were novel relative to dbSNP build 129. We tested associations of 173 SNP/SNVs with the FTND score using data obtained from 430 individuals (18 were excluded because of reduced completion rate), using linear regression for common variants, the cohort allelic sum test and the weighted sum statistic for rare variants, and the multivariate distance matrix regression method for both common and rare SNP/SNVs. Association testing of common SNPs, with adjustment for correlated tests within each gene, identified significant associations with two CHRNB2 SNPs; e.g., the minor allele of rs2072660 increased the mean FTND score by 0.6 units (P=0.01). We observed significant evidence for association with the FTND score of common and rare SNP/SNVs at CHRNA5 and CHRNB2, and of rare SNVs at CHRNA4. Common and/or rare SNP/SNVs from multiple nAChR subunit genes are associated with the FTND score in this sample of treatment-seeking smokers.
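The common-variant result above is reported as a change in mean FTND score per minor allele. One common way to obtain such an estimate is a linear regression of the score on the minor-allele count (0, 1 or 2). The sketch below assumes that additive coding and uses no covariates; the abstract does not specify the coding or the adjustment set. The function name and the toy data are made up for illustration and are not study data.

```python
from statistics import mean


def additive_effect(genotypes, scores):
    """Ordinary least-squares fit of score ~ minor-allele count.

    Returns (slope, intercept); the slope is the estimated change in
    mean score per additional copy of the minor allele.
    """
    if len(genotypes) != len(scores) or len(genotypes) < 2:
        raise ValueError("need two or more paired observations")
    g_bar = mean(genotypes)
    s_bar = mean(scores)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotypes show no variation")
    sxy = sum((g - g_bar) * (s - s_bar) for g, s in zip(genotypes, scores))
    slope = sxy / sxx
    intercept = s_bar - slope * g_bar
    return slope, intercept


# Toy data (assumed, not from the study): minor-allele counts and scores.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_scores = [4.0, 5.0, 5.0, 6.0, 6.0, 7.0]
slope, intercept = additive_effect(toy_genotypes, toy_scores)
print(f"per-allele effect: {slope:.2f}, baseline mean: {intercept:.2f}")
```

A real analysis would also need a standard error and P value for the slope, and the within-gene multiple-testing adjustment the abstract mentions. Both are omitted here to keep the sketch short.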