Human papillomavirus (HPV) integration is a key genetic event in cervical carcinogenesis. By conducting whole-genome sequencing and high-throughput viral integration detection, we identified 3,667 HPV integration breakpoints in 26 cervical intraepithelial neoplasias, 104 cervical carcinomas and five cell lines. Beyond recalculating frequencies for the previously reported frequent integration sites POU5F1B (9.7%), FHIT (8.7%), KLF12 (7.8%), KLF5 (6.8%), LRP1B (5.8%) and LEPREL1 (4.9%), we discovered new hot spots HMGA2 (7.8%), DLG2 (4.9%) and SEMA3D (4.9%). Protein expression from FHIT and LRP1B was downregulated when HPV integrated in their introns. Protein expression from MYC and HMGA2 was elevated when HPV integrated into flanking regions. Moreover, microhomologous sequence between the human and HPV genomes was significantly enriched near integration breakpoints, indicating that fusion between viral and human DNA may have occurred by microhomology-mediated DNA repair pathways. Our data provide insights into HPV integration-driven cervical carcinogenesis.
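To make the microhomology observation concrete, the sketch below counts how many bases around a human-virus junction are identical in both reference sequences, so that they could be attributed to either side. It is only an illustration under stated assumptions, not the pipeline used in the study: the function name `microhomology_length`, the toy sequences and the coordinate convention (the junction joins `human[:h]` to `viral[v:]`) are all hypothetical.

```python
def microhomology_length(human: str, h: int, viral: str, v: int) -> int:
    """Count bases shared by both references around a junction.

    Assumed convention (hypothetical): the chimeric read consists of
    human[:h] followed by viral[v:]. Bases that match when both
    references are extended left or right from the breakpoint are
    ambiguous in origin and are counted as microhomology.
    """
    # Extend to the right of the breakpoint while both references agree.
    right = 0
    while (h + right < len(human) and v + right < len(viral)
           and human[h + right] == viral[v + right]):
        right += 1

    # Extend to the left of the breakpoint while both references agree.
    left = 0
    while (h - left - 1 >= 0 and v - left - 1 >= 0
           and human[h - left - 1] == viral[v - left - 1]):
        left += 1

    return left + right


# Toy example: the references share "CGT" immediately right of the breakpoint.
shared = microhomology_length("AAACGTTT", 3, "GGGCGTCC", 3)
print(shared)
```

Testing for enrichment would additionally require comparing such lengths against those found at randomly placed control junctions, which this sketch does not attempt.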
Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.
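The contig and scaffold N50 values quoted above follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the total assembly. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative and are not taken from the HX1 assembly.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("n50 requires at least one length")

    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare against half the total without using floating point.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example: total is 20, and 6 + 5 = 11 first reaches half of it.
print(n50([2, 3, 4, 5, 6]))
```

Because N50 depends only on the length distribution, a high value indicates contiguity but says nothing on its own about assembly correctness.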
The ongoing global novel coronavirus pneumonia COVID‐19 outbreak has caused numerous cases of infection and death. COVID‐19 diagnosis relies upon nucleic acid detection; however, currently recommended methods exhibit high false‐negative rates and are unable to identify other respiratory virus infections, resulting in patient misdiagnosis and impeding epidemic containment. Combining the advantages of targeted amplification and long‐read, real‐time nanopore sequencing, nanopore targeted sequencing (NTS) is developed herein to detect SARS‐CoV‐2 and other respiratory viruses simultaneously within 6–10 h, with a limit of detection of ten standard plasmid copies per reaction. Compared with its specificity for five common respiratory viruses, the specificity of NTS for SARS‐CoV‐2 reaches 100%. Parallel testing of 61 nucleic acid samples from suspected COVID‐19 cases with NTS and with approved real‐time reverse transcription‐polymerase chain reaction kits for SARS‐CoV‐2 shows that NTS identifies more infected patients (22/61) as positive, while also effectively monitoring for mutated nucleic acid sequences, categorizing types of SARS‐CoV‐2, and detecting other respiratory viruses in the test sample. NTS is thus suitable for COVID‐19 diagnosis; moreover, this platform can be further extended for diagnosing other viruses and pathogens.
Efficient crop improvement depends on the application of accurate genetic information contained in diverse germplasm resources. Here we report a reference-grade genome of wild soybean accession W05, with a final assembled genome size of 1013.2 Mb and a contig N50 of 3.3 Mb. The analytical power of the W05 genome is demonstrated by several examples. First, we identify an inversion at the locus determining seed coat color during domestication. Second, a translocation event between chromosomes 11 and 13 of some genotypes is shown to interfere with the assignment of QTLs. Third, we find a region containing copy number variations of the Kunitz trypsin inhibitor (KTI) genes. Such findings illustrate the power of this assembly in the analysis of large structural variations in soybean germplasm collections. The wild soybean genome assembly has wide applications in comparative genomic and evolutionary studies, as well as in crop breeding and improvement programs.