Background: Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities.
Results: We have developed a computational workflow for similarity-based detection and downstream analysis of satellite repeats in individual nanopore reads that led to genome-wide characterization of their properties. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing eleven major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73x genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favorable for satellite DNA accumulation.
Conclusions: The presented approach proved to be efficient in revealing differences in long-range organization of satellite repeats that can be used to investigate their origin and evolution in the genome.
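As an illustration of the assembly-free idea summarized above, the following minimal Python sketch shows one way similarity hits to a satellite on a single read could be merged into arrays and measured. It is not the workflow used in this study: the function names, the `max_gap` tolerance, the `edge` parameter and the hit coordinates are all hypothetical, and the hits are assumed to come from some prior similarity search of reads against satellite reference sequences.

```python
from typing import List, Tuple

# A hit is a (start, end) interval on one read, in half-open coordinates.
# Hypothetical input: positions where a read is similar to a satellite reference.
Hit = Tuple[int, int]


def merge_hits_into_arrays(hits: List[Hit], max_gap: int = 100) -> List[Hit]:
    """Merge hits separated by at most max_gap bases into contiguous arrays.

    max_gap is an assumed tolerance for short stretches where sequencing
    errors interrupt similarity detection; it is not taken from the study.
    """
    arrays: List[Hit] = []
    for start, end in sorted(hits):
        if arrays and start - arrays[-1][1] <= max_gap:
            prev_start, prev_end = arrays[-1]
            arrays[-1] = (prev_start, max(prev_end, end))
        else:
            arrays.append((start, end))
    return arrays


def array_lengths(arrays: List[Hit], read_length: int, edge: int = 0) -> List[Tuple[int, bool]]:
    """Return (length, is_truncated) for each array on a read.

    An array that reaches within `edge` bases of either read end is flagged
    as truncated: its observed length is only a lower bound on the true length.
    """
    result = []
    for start, end in arrays:
        truncated = start <= edge or end >= read_length - edge
        result.append((end - start, truncated))
    return result
```

For example, with hypothetical hits `[(0, 500), (550, 1200), (5000, 5300)]` on a 10,000 bp read, the first two hits merge into a single array that touches the read start and is therefore flagged as truncated, while the third forms a short array lying entirely within the read. Separating complete from truncated arrays matters for any length distribution inferred from individual reads, because truncated arrays give only minimum lengths.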