Haplotype diversity and sequence heterogeneity of human telomeres

Grigorev, Kirill; Foox, Jonathan; Bezdan, Daniela; Butler, Daniel; Luxton, Jared J.; Reed, Jake; McKenna, Miles J.; Taylor, Lynn E.; George, K.; Meydan, Cem; Bailey, Susan M.; Mason, Christopher E.

doi:10.1101/gr.274639.120

Cited by 35 publications

(35 citation statements)

References 36 publications


“…Similar to several other studies [16, 9], our results showed that the telomeres and centromeres in the human genome are major sources of genetic variation. The combination of long-read technology and the T2T-CHM13 reference genome will likely open up new doors for population-scale studies on the roles of these satellite sequences in human health and disease [17].…”

Section: Discussion (supporting)

confidence: 91%

An Algorithm for Sequence Location Approximation using Nuclear Families (ASLAN) Validates Regions of the Telomere-to-Telomere Assembly and Identifies New Hotspots for Genetic Diversity

Chrisman, Paskov, et al. 2022 (preprint)


Although it is heavily relied on to study genetic contributors to health and disease, the current human reference genome (GRCh38) is incomplete in two major ways: firstly, it is missing large sections of heterochromatic sequence, and secondly, as a singular, linear reference genome it does not represent the full spectrum of genetic diversity that exists in the human species. In order to better understand and characterize gaps in GRCh38 and genetic diversity, we developed a method - ASLAN, an Algorithm for Sequence Location Approximation using Nuclear families - that identifies the region of origin of short reads that do not align to the GRCh38. Using unmapped reads and variant calls from whole genome sequencing (WGS) data from nuclear families, ASLAN relies on a maximum likelihood model to identify the most likely region of the genome that a subsequence belongs to, given the phasing information of family and the distribution of the subsequence in the unmapped reads. Validating ASLAN on a synthetically generated dataset, and on true reads originating from the alternative haplotypes in the decoy genome, we show that ASLAN can localize more than 90% of 100-basepair sequences with above 92% accuracy and around 1 megabase of resolution. We then run ASLAN on 100-mers from unmapped reads from WGS from over 700 families, and compare ASLAN localizations to alignment of the 100-mers to the T2T-CHM13 assembly, recently released by the Telomere-to-telomere (T2T) consortia. We find that many unmapped reads in GRCh38 originate from telomeres and centromeres that are gaps in the GRCh38 reference. We also confirm that ASLAN localizations are in high concordance with T2T-CHM13 alignments, except in the centromeres of the acrocentric chromosomes. Comparing ASLAN localizations and T2T-CHM13 alignments, we identify sequences missing from T2T-CHM13 or sequences with high divergence from their aligned region in T2T-CHM13, thus highlighting new hotspots for genetic diversity.


“…It is interesting to note that telomere-like sequences are also frequently found near telomeric regions [19][20][21][22]. Specifically, there are three main types of telomere-like repeat sequences that are frequently found near telomeres in the human genome, namely the c-type repeats (TCAGGG)n, g-type repeats (TGAGGG)n, and j-type repeats (TTGGGG)n [23].…”
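The three variant repeat types named in this statement can be illustrated with a minimal sketch that counts tandem hexamer units in a DNA string. The function name `count_repeat_units`, the `min_units` threshold, and the example sequence are assumptions made for illustration; they are not taken from the cited papers. The canonical human repeat TTAGGG is included for comparison.

```python
import re

# Hexamer units: the canonical human telomeric repeat plus the three
# telomere-like variants named in the quoted statement.
MOTIFS = {
    "canonical": "TTAGGG",
    "c-type": "TCAGGG",
    "g-type": "TGAGGG",
    "j-type": "TTGGGG",
}


def count_repeat_units(seq, min_units=2):
    """Count hexamer units that lie inside tandem runs of at least
    `min_units` consecutive copies (hypothetical threshold)."""
    seq = seq.upper()
    counts = {}
    for name, unit in MOTIFS.items():
        # e.g. (?:TTAGGG){2,} matches two or more back-to-back copies
        pattern = re.compile(f"(?:{unit}){{{min_units},}}")
        counts[name] = sum(
            len(m.group(0)) // len(unit) for m in pattern.finditer(seq)
        )
    return counts


# Made-up example sequence, not real data.
example = "ACGT" + "TTAGGG" * 4 + "TCAGGG" * 2 + "TTAGGG" + "TGAGGG" * 3 + "CCAT"
print(count_repeat_units(example))
# {'canonical': 4, 'c-type': 2, 'g-type': 3, 'j-type': 0}
```

The isolated single TTAGGG in the example is not counted because it does not form a tandem run; lowering `min_units` to 1 would include it. Only the forward strand is scanned here, so a real analysis would also need to consider the reverse complement.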

Section: Results (mentioning)

confidence: 99%

Identifying and correcting repeat-calling errors in nanopore sequencing of telomeres

et al. 2022


Nanopore long-read sequencing is an emerging approach for studying genomes, including long repetitive elements like telomeres. Here, we report extensive basecalling induced errors at telomere repeats across nanopore datasets, sequencing platforms, basecallers, and basecalling models. We find that telomeres in many organisms are frequently miscalled. We demonstrate that tuning of nanopore basecalling models leads to improved recovery and analysis of telomeric regions, with minimal negative impact on other genomic regions. We highlight the importance of verifying nanopore basecalls in long, repetitive, and poorly defined regions, and showcase how artefacts can be resolved by improvements in nanopore basecalling models.


“…Telomere length measurement by nanopore sequencing will allow the telomere field to study new questions and revisit past unanswered questions in telomere biology. A different method of long-read sequencing, Pacific Biosciences (PacBio) sequencing, has recently been applied to human telomere sequences (Grigorev et al. 2021), indicating that long-read sequencing technologies are widely useful in telomere research. It will be of interest to determine whether chromosome end-specific telomere length differences are generalizable to other organisms, as well as perhaps even humans, and understanding how they are established and maintained will be a fascinating new area of telomere biology to explore.…”

Section: Discussion (mentioning)

confidence: 99%

Chromosome-specific telomere lengths and the minimal functional telomere revealed by nanopore sequencing

et al. 2021


We developed a method to tag telomeres and measure telomere length by nanopore sequencing in the yeast S. cerevisiae. Nanopore allows long-read sequencing through the telomere, subtelomere and into unique chromosomal sequence, enabling assignment of telomere length to a specific chromosome end. We observed chromosome end specific telomere lengths that were stable over 120 cell divisions. These stable chromosome-specific telomere lengths may be explained by slow clonal variation or may represent a new biological mechanism that maintains equilibrium unique to each chromosome end. We examined the role of RIF1 and TEL1 in telomere length regulation and found that TEL1 is epistatic to RIF1 at most telomeres, consistent with the literature. However, at telomeres that lack subtelomeric Y’ sequences, tel1Δ rif1Δ double mutants had a very small, but significant, increase in telomere length compared to the tel1Δ single mutant, suggesting an influence of Y’ elements on telomere length regulation. We sequenced telomeres in a telomerase-null mutant (est2Δ) and found the minimal telomere length to be around 75 bp. In these est2Δ mutants there were apparent telomere recombination events at individual telomeres before the generation of survivors, and these events were significantly reduced in est2Δ rad52Δ double mutants. The rate of telomere shortening in the absence of telomerase was similar across all chromosome ends at about 5 bp per generation. This new method gives quantitative, high resolution telomere length measurement at each individual chromosome end, and suggests possible new biological mechanisms regulating telomere length.
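The two figures reported in this abstract (about 5 bp lost per generation without telomerase, and a minimal telomere length of around 75 bp) imply a simple back-of-the-envelope estimate of how many generations a telomere could shorten before reaching the minimum. The sketch below assumes strictly linear shortening with no recombination, and the 300 bp starting length is a hypothetical input chosen for illustration, not a value from the abstract.

```python
import math

SHORTENING_BP_PER_GENERATION = 5  # approximate rate reported in the abstract
MINIMAL_TELOMERE_BP = 75          # approximate minimal length reported in the abstract


def generations_to_minimal(start_bp,
                           rate=SHORTENING_BP_PER_GENERATION,
                           floor=MINIMAL_TELOMERE_BP):
    """Generations until a telomere of `start_bp` reaches `floor`,
    assuming a constant loss of `rate` bp per generation."""
    if start_bp <= floor:
        return 0
    return math.ceil((start_bp - floor) / rate)


# Hypothetical starting length of 300 bp: (300 - 75) / 5 = 45
print(generations_to_minimal(300))  # 45
```

Because the abstract also reports apparent recombination events at individual telomeres in these mutants, a linear estimate like this is only a rough upper-level approximation.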
