Background
Identification of germline variation and somatic mutations is a major issue in human genetics. However, due to the limitations of DNA sequencing technologies and computational algorithms, our understanding of genetic variation and somatic mutations is far from complete.
Methods
In the present study, we performed whole-genome sequencing using long-read sequencing technology (Oxford Nanopore) for 11 Japanese liver cancers and matched normal samples which were previously sequenced for the International Cancer Genome Consortium (ICGC). We constructed an analysis pipeline for the long-read data and identified germline and somatic structural variations (SVs).
Results
Among polymorphic germline SVs, our analysis identified 8004 insertions, 6389 deletions, 27 inversions, and 32 intra-chromosomal translocations. By comparison with the chimpanzee genome, we correctly inferred the events that caused insertions and deletions and found that most insertions were caused by transposons, with Alu the predominant source, while other types of insertions, such as tandem duplications and processed pseudogenes, were rare. We inferred the mechanisms of deletion generation and found that most non-allelic homologous recombination (NAHR) events were caused by recombination errors in SINEs. Analysis of somatic mutations in liver cancers showed that long reads could detect larger numbers of SVs than a previous short-read study and that the mechanisms of cancer SV generation differed from those of germline deletions.
Conclusions
Our analysis provides a comprehensive catalog of polymorphic and somatic SVs, as well as their possible causes. Our software is available at https://github.com/afujimoto/CAMPHOR and https://github.com/afujimoto/CAMPHORsomatic.
Background
Next-generation sequencing has allowed for the identification of different genetic variations, which are known to contribute to diseases. Of these, insertions and deletions are the second most abundant type of variation in the genome, but their biological importance and disease associations are not well studied, especially for deletions of intermediate size.
Methods
We identified intermediate-sized deletions from whole-genome sequencing (WGS) data of Japanese samples (n = 174) with a novel deletion calling method which considered multiple samples. These deletions were used to construct a reference panel for use in imputation. Imputation was then conducted using the reference panel and data from 82 publicly available Japanese samples with gene expression data. The accuracy of the deletion calling and imputation was examined with Nanopore long-read sequencing technology. We also conducted an expression quantitative trait loci (eQTL) association analysis using the deletions to infer their functional impacts on genes, before characterizing the deletions causal for gene expression level changes.
Results
We obtained a set of 4378 polymorphic high-confidence deletions and constructed a reference panel. The deletions were successfully imputed into the Japanese samples with high accuracy (97.3%). The eQTL analysis identified 181 deletions (4.1%) suggested as causal for gene expression level changes. The causal deletion candidates were significantly enriched in promoters, super-enhancers, and transcription elongation chromatin states. Generation of deletions in a cell line with the CRISPR-Cas9 system confirmed that they were indeed causative variants for gene expression change. Furthermore, one of the deletions was observed to affect the expression level of a gene in which it was not located.
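As an illustration of how an imputation accuracy and the reported fraction of candidate causal deletions could be computed, the following is a minimal sketch. The abstract does not define its accuracy metric, so the simple genotype concordance used here, the 0/1/2 genotype encoding (number of deletion alleles), the function name, and the toy genotype lists are all assumptions and are not taken from the study; only the counts 181 and 4378 come from the text.

```python
def genotype_concordance(imputed, validated):
    """Fraction of genotype calls that agree between two call sets.

    Genotypes are assumed to be coded 0, 1, or 2 (deletion allele count);
    None marks a missing call, and such sites are skipped.
    """
    pairs = [(i, v) for i, v in zip(imputed, validated)
             if i is not None and v is not None]
    if not pairs:
        raise ValueError("no comparable genotype calls")
    return sum(i == v for i, v in pairs) / len(pairs)


# Hypothetical toy data: imputed calls vs. calls from long-read validation.
imputed = [0, 1, 2, 1, 0, None, 2, 1]
validated = [0, 1, 2, 0, 0, 1, 2, 1]
concordance = genotype_concordance(imputed, validated)

# Fraction of high-confidence deletions flagged as candidate causal eQTLs,
# using the counts reported in the Results.
causal_fraction = 181 / 4378

print(f"toy concordance: {concordance:.3f}")
print(f"candidate causal deletions: {100 * causal_fraction:.1f}%")
```

Skipping missing calls rather than counting them as errors is a design choice of this sketch; a stricter metric would penalize sites that could not be imputed.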
Conclusions
This paper reports an accurate deletion calling method for genotype imputation at the whole genome level and shows the importance of intermediate-sized deletions in the human population.
Electronic supplementary material
The online version of this article (10.1186/s13073-019-0656-4) contains supplementary material, which is available to authorized users.
In thirteen patients with normal liver function, the mean concentrations of cefbuperazone in hepatic bile, gallbladder bile and gallbladder tissue 30 min after injection were 1134.8 ± 36.8 (mean ± S.E.M.) mg/l, 6.6 ± 3.0 mg/l and 26.1 ± 7.6 mg/l, respectively. In patients with obstructive jaundice, cefbuperazone concentrations in bile were 99 ± 29.2 mg/l (mean ± S.E.M.) 1 h post-dose and decreased to 13.9 ± 5.1 mg/l 6 h post-dose. In both groups of patients, biliary concentrations of cefbuperazone were higher than the MICs of most organisms causing biliary infection.