Genomic technologies such as next-generation sequencing (NGS) are revolutionizing molecular diagnostics and clinical medicine. However, these approaches have proven inefficient at identifying pathogenic repeat expansions. Here, we apply a collection of bioinformatics tools that can be utilized to identify either known or novel expanded repeat sequences in NGS data. We performed genetic studies of a cohort of 35 individuals from 22 families with a clinical diagnosis of cerebellar ataxia with neuropathy and bilateral vestibular areflexia syndrome (CANVAS). Analysis of whole-genome sequence (WGS) data with five independent algorithms identified a recessively inherited intronic repeat expansion [(AAGGG)exp] in the gene encoding Replication Factor C1 (RFC1). This motif, not reported in the reference sequence, localized to an Alu element and replaced the reference (AAAAG)11 short tandem repeat. Genetic analyses confirmed the pathogenic expansion in 18 of 22 CANVAS-affected families and identified a core ancestral haplotype, estimated to have arisen in Europe more than twenty-five thousand years ago. WGS of the four RFC1-negative CANVAS-affected families identified plausible variants in three, with genomic re-diagnosis of SCA3, spastic ataxia of the Charlevoix-Saguenay type, and SCA45. This study identified the genetic basis of CANVAS and demonstrated that these improved bioinformatics tools increase the diagnostic utility of WGS to determine the genetic basis of a heterogeneous group of clinically overlapping neurogenetic disorders.
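To make the motif replacement described above concrete, the following is a minimal sketch of counting back-to-back copies of a repeat unit in a sequence. It is not one of the five algorithms used in the study, which work on aligned read evidence rather than exact string matches. The function name `longest_tandem_run`, the three-base flanks, and the 40-copy expanded allele are all hypothetical values chosen for illustration; only the motifs AAAAG and AAGGG and the reference copy number of 11 come from the abstract.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`.

    A lookahead is used so that every start position is tested, which keeps
    the count correct even for motifs that can overlap with themselves.
    """
    pattern = re.compile(f"(?=((?:{re.escape(motif.upper())})+))")
    longest = 0
    for match in pattern.finditer(sequence.upper()):
        copies = len(match.group(1)) // len(motif)
        longest = max(longest, copies)
    return longest


# Hypothetical toy alleles: the flanking bases and the 40-copy length are
# made up; the 11-copy AAAAG tract mirrors the reference repeat in the text.
reference_like = "TTC" + "AAAAG" * 11 + "GAT"
expanded_like = "TTC" + "AAGGG" * 40 + "GAT"

for name, allele in [("reference-like", reference_like), ("expanded-like", expanded_like)]:
    print(name, {m: longest_tandem_run(allele, m) for m in ("AAAAG", "AAGGG")})
```

Exact matching like this would miss interrupted or error-containing repeats in real reads, which is one reason dedicated repeat-expansion callers are needed for WGS data.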
Familial Adult Myoclonic Epilepsy (FAME) is a genetically heterogeneous disorder characterized by cortical tremor and seizures. Intronic TTTTA/TTTCA repeat expansions in SAMD12 (FAME1) are the main cause of FAME in Asia. Using genome sequencing and repeat-primed PCR, we identify another site of this repeat expansion, in MARCH6 (FAME3), in four European families. Analysis of single DNA molecules with nanopore sequencing and molecular combing shows that expansions range from 3.3 to 14 kb on average. However, we observe considerable variability in expansion length and structure, supporting the existence of multiple expansion configurations in blood cells and fibroblasts of the same individual. Moreover, the largest expansions are associated with micro-rearrangements occurring near the expansion in 20% of cells. This study provides further evidence that FAME is caused by intronic TTTTA/TTTCA expansions in distinct genes and reveals that expansions exhibit an unexpectedly high somatic instability that can ultimately result in genomic rearrangements.