After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist 1,2. Here we present a human genome assembly that surpasses the continuity of GRCh38 2 , along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome 3 , we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes. Complete, telomere-to-telomere reference genome assemblies are necessary to ensure that all genomic variants are discovered and studied. At present, unresolved areas of the human genome are defined by multi-megabase satellite arrays in the pericentromeric regions and the ribosomal DNA arrays on acrocentric short arms, as well as regions enriched in segmental duplications that are greater than hundreds of kilobases in length and that exhibit sequence identity of more than 98% between paralogues. 
Owing to their absence from the reference, these repeat-rich sequences are often excluded from genetics and genomics studies, which limits the scope of association and functional analyses 4,5. Unresolved repeat sequences also result in unintended consequences; for example, paralogous sequence variants incorrectly being called as allelic variants 6, and the contamination of bacterial gene databases 7. Completion of the entire human genome is expected to contribute to our understanding of chromosome function 8, human disease 9 and genomic variation, which will improve technologies in biomedicine that use short-read mapping to a reference genome (for example, RNA sequencing (RNA-seq) 10, chromatin immunoprecipitation followed by sequencing (ChIP-seq) 11 and assay for transposase-accessible chromatin using sequencing (ATAC-seq) 12). The fundamental challenge of reconstructing a genome from many comparatively short sequencing reads, a process known as genome assembly, is distinguishing the repeated sequences from one another 13. Resolving such repeats relies on sequencing reads that are long enough to span the entire repeat or accurate enough to distinguish each repeat copy on the basis of...
Up to 10% of cases of gastric cancer are familial, but so far, only mutations in CDH1 have been associated with gastric cancer risk. To identify genetic variants that affect risk for gastric cancer, we collected blood samples from 28 patients with hereditary diffuse gastric cancer (HDGC) not associated with mutations in CDH1 and performed whole-exome sequence analysis. We then analyzed sequences of candidate genes in 333 independent HDGC and non-HDGC cases. We identified 11 cases with mutations in PALB2, BRCA1, or RAD51C genes, which regulate homologous DNA recombination. We found these mutations in 2 of 31 patients with HDGC (6.5%) and 9 of 331 patients with sporadic gastric cancer (2.8%). Most of these mutations had been previously associated with other types of tumors and partially co-segregated with gastric cancer in our study. Tumors that developed in patients with these mutations had a mutation signature associated with somatic homologous recombination deficiency. Our findings indicate that defects in homologous recombination increase risk for gastric cancer.
After nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no one chromosome has been finished end to end, and hundreds of unresolved gaps persist 1,2. The remaining gaps include ribosomal DNA (rDNA) arrays, large near-identical segmental duplications, and satellite DNA arrays. These regions harbor largely unexplored variation of unknown consequence, and their absence from the current reference genome can lead to experimental artifacts and hide true variants when re-sequencing additional human genomes. Here we present a de novo human genome assembly that surpasses the continuity of GRCh38 2, along with the first gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome 3, we reconstructed the ~2.8 megabase centromeric satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). This complete chromosome X, combined with the ultra-long nanopore data, also allowed us to map methylation patterns across complex tandem repeats and satellite arrays for the first time. These results demonstrate that finishing the human genome is now within reach and will enable ongoing efforts to complete the remaining human chromosomes. Complete, telomere-to-telomere reference assemblies are necessary to ensure that all genomic variants, large and small, are discovered and studied. 
Currently, unresolved regions of the human genome are defined by multi-megabase satellite arrays in the pericentromeric regions and the rDNA arrays on acrocentric short arms, as well as regions enriched in segmental duplications that are greater than hundreds of kilobases in length and greater than 98% identical between paralogs. Due to their absence from the reference, these repeat-rich sequences are often excluded from contemporary genetics and genomics studies, limiting the scope of association and functional analyses 4,5. Unresolved repeat sequences also result in unintended consequences such as paralogous sequence variants incorrectly called as allelic variants 6 and even the contamination of bacterial gene databases 7. Completion of the entire human genome is expected to contribute to our understanding of chromosome function 8 and human disease 9, and a comprehensive understanding of genomic variation will improve the driving technologies in biomedicine that currently use short-read mapping to a reference genome (e.g., RNA-seq 10, ChIP-seq 11, ATAC-seq 12). The fundamental challenge of reconstructing a genome from many comparatively short sequencing reads, a process known as genome assembly, is distinguishing the repeated sequences from one another 13. Resolving such r...
Colorectal cancer (CRC) is a major public health problem, and its incidence is rising in developing countries. However, studies characterizing CRC clinicopathological features in cases from developing countries are still lacking. The goal of this study was to evaluate clinicopathological and demographic features in one of the largest CRC studies in Latin America. The study involved over 1525 CRC cases recruited in a multicenter study in Colombia between 2005 and 2014 as part of ongoing genetic and epidemiological studies. We gathered clinicopathological data such as age at diagnosis, sex, body mass index, tobacco and alcohol consumption, family history of cancer, and tumor features including location, histological type, and stage. Statistical analyses were performed to test the association between age of onset, sex, and clinical manifestations. The average age at CRC diagnosis was 57.4 years, with 26.5% of cases having early-onset CRC (diagnosed by age 50 years). Most cases were women (53.2%; P = 0.009), 49.2% were overweight or obese, 49.1% were regular alcohol drinkers, 52% were smokers/former smokers, and 12.2% reported relatives with cancer. Most tumors in the study were located in the rectum (42.7%), were adenocarcinomas (91.5%), and had advanced stage (T3–T4, 79.8%). Comparisons by sex found that male cases were more likely to be obese (36.5% vs 31.1%; P = 0.001), less likely to have a family history of cancer (9.7% vs 15.3%; P = 0.016), and more likely to have advanced-stage tumors (83.9% vs 76.1%; P = 0.036). Comparisons by age of onset found that early-onset cases were more likely to be women (59.3% vs 51.0%; P = 0.005) and report a family history of cancer (17.4% vs 10.2%; P = 0.001). To our knowledge, our study is the largest report of clinicopathological characterization of Hispanic CRC cases, and we suggest that further studies are needed to understand CRC etiology in diverse Hispanic populations.
Cancer cells often have unstable genomes and increased centrosome and chromosome numbers, which play an important part in malignant transformation in the most recent models of tumorigenesis. However, very little is known about divisional failures in cancer cells that may lead to chromosomal and centrosomal amplifications. We show here that cancer cells often failed at cytokinesis due to decreased phosphorylation of the myosin regulatory light chain (MLC), a key regulatory component of cortical contraction during division. Reduced MLC phosphorylation was associated with high expression of myosin phosphatase and/or reduced myosin light chain kinase levels. Furthermore, expression of phosphomimetic MLC largely prevented cytokinesis failure in the tested cancer cells. When myosin light chain phosphorylation was restored to normal levels by phosphatase knockdown, multinucleation and multipolar mitosis were both markedly reduced, resulting in enhanced genome stabilization. Furthermore, either overexpression of myosin phosphatase or inhibition of the myosin light chain kinase (MLCK) in nonmalignant cells can recapitulate some of the mitotic defects of cancer cells, including multinucleation and multipolar spindles, indicating that these changes are sufficient to reproduce the cytokinesis failures we see in cancer cells. These results for the first time define the molecular defects leading to divisional failure in cancer cells.