2021
DOI: 10.1101/2021.07.12.452052
Preprint

Complete genomic and epigenetic maps of human centromeres

Abstract: Existing human genome assemblies have almost entirely excluded highly repetitive sequences within and near centromeres, limiting our understanding of their sequence, evolution, and essential role in chromosome segregation. Here, we present an extensive study of newly assembled peri/centromeric sequences representing 6.2% (189.9 Mb) of the first complete, telomere-to-telomere human genome assembly (T2T-CHM13). We discovered novel patterns of peri/centromeric repeat organization, variation, and evolution at both…

citations
Cited by 26 publications
(66 citation statements)
references
References 101 publications
“…Pacific Biosciences (PacBio) uses repeated sequencing of a circular molecule to build consensus across time 2 . The accuracy of these approaches, and the manner they fail, ultimately limits the read lengths of these methods and the analyzable regions of the genome 3,4…”
Section: Introduction
mentioning
confidence: 99%
“…Datasets. We extracted the alpha satellite arrays from the assembly (public release v1.0) of the effectively haploid CHM13 human cell line constructed by the T2T Consortium (Miga et al, 2020; Nurk et al, 2021; Altemose et al, 2021). We also extracted the alpha satellite array of the newly assembled centromere of chromosome X from HG002 cell line sequenced by the Human Pangenome Reference Consortium.…”
Section: Results
mentioning
confidence: 99%
“…HORmon launches CentromereArchitect (Dvorkina et al, 2021) to generate the initial monomer-set and further modifies it by using the monomer-HOR feedback loop described in the Methods section (Figure HORmonPipeline). Supplementary Note "HORmon monomer naming" describes how HORmon assigns names to monomers and provides correspondence between these names and the traditional names described in Uralsky et al, 2019. Since CentromereArchitect identifies many infrequent monomers, comparing its monomer-set with the previously identified monomer-sets, e.g., the monomer-set MonomersT2T (Altemose et al, 2021) used by the T2T consortium (based on the monomer-set derived in Shepelev et al, 2015 and Uralsky et al, 2019), is not straightforward. HORmon thus filters the monomer-set generated by CentromereArchitect as described below.…”
Section: Results
mentioning
confidence: 99%
“…However, the AvaI repeats only represent a very small fraction of the centromeric regions described, with the largest one only covering 14 kb [33], whereas the estimation of the extent of the centromeres, based on a GC content lower than the genome average is much larger although imprecise and supposes a similar organization as for the AT-rich alpha-satellite repeats in vertebrates, such as human [69].…”
mentioning
confidence: 99%