Background Extrachromosomal circular DNAs (eccDNAs) are ring-like DNA structures that are physically separate from the chromosomes and range in size from 100 bp to several megabase pairs. Apart from carrying tandemly repeated DNA, eccDNAs may also harbor extra copies of genes or recently activated transposable elements. As eccDNAs occur in all eukaryotes investigated so far and likely play roles in stress, cancer, and aging, they have been prime targets in recent research, although their investigation has been limited by the scarcity of computational tools.
Results Here, we present the ECCsplorer, a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing techniques. Following Illumina sequencing of amplified circular DNA (circSeq), the ECCsplorer enables an easy and automated discovery of eccDNA candidates. The data analysis encompasses two major procedures. First, read mapping to the reference genome allows the detection of informative read distributions, including high coverage, discordant mapping, and split reads. Second, a reference-free comparison of read clusters from amplified eccDNA against control sample data reveals specifically enriched DNA circles. Both software parts can be run separately or jointly, depending on the individual aim or data availability. To illustrate the wide applicability of our approach, we analyzed semi-artificial and published circSeq data from the model organisms Homo sapiens and Arabidopsis thaliana, and generated circSeq reads from the non-model crop plant Beta vulgaris. We clearly identified eccDNA candidates from all datasets, with and without reference genomes. The ECCsplorer pipeline specifically detected mitochondrial mini-circles and retrotransposon activation, demonstrating its sensitivity and specificity.
Conclusion The ECCsplorer (available online at https://github.com/crimBubble/ECCsplorer) is a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing data. The derived eccDNA targets are valuable for a wide range of downstream investigations, from the analysis of cancer-related eccDNAs through organelle genomics to the identification of active transposable elements.
Supplementary data are available at Bioinformatics online.
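The mapping-based signals named in the ECCsplorer abstract (high coverage and discordant mapping) can be illustrated with a minimal Python sketch. This is not the ECCsplorer implementation: the `Mate` record, the function names, and the half-open coordinate convention are assumptions made for illustration only. The sketch relies on the general observation that a read pair spanning a circle junction tends to map in outward-facing orientation on a linear reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mate:
    """Hypothetical minimal alignment record (not a real library type)."""
    chrom: str
    start: int     # 0-based leftmost aligned position
    end: int       # exclusive end of the alignment
    reverse: bool  # True if aligned to the reverse strand


def is_outward_facing(a: Mate, b: Mate) -> bool:
    """Return True if a read pair maps in outward-facing orientation.

    On a linear reference, a regular pair has the forward mate to the
    left of the reverse mate. A pair spanning a circle junction shows
    the opposite layout: the reverse mate lies left of the forward mate.
    """
    if a.chrom != b.chrom or a.reverse == b.reverse:
        return False
    fwd, rev = (b, a) if a.reverse else (a, b)
    # Only non-overlapping mates are flagged in this simplified check.
    return rev.end <= fwd.start


def mean_coverage(alignments, region_start, region_end):
    """Mean per-base depth over [region_start, region_end).

    `alignments` is an iterable of (start, end) half-open intervals
    on the same chromosome as the region.
    """
    covered = 0
    for start, end in alignments:
        covered += max(0, min(end, region_end) - max(start, region_start))
    return covered / (region_end - region_start)
```

In such a scheme, a candidate region would be one where outward-facing pairs accumulate at both borders and the mean coverage in the amplified sample clearly exceeds that of the control. Any thresholds would have to be chosen per dataset.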
SUMMARY Short interspersed nuclear elements (SINEs) are highly abundant non-autonomous retrotransposons that are widespread in plants. They are short, non-coding, and show high sequence diversity, and are therefore mostly unannotated or incorrectly annotated in plant genome sequences. Hence, comparative studies on genomic SINE populations are rare. To explore the structural organization and impact of SINEs, we comparatively investigated the genome sequences of the Solanaceae species potato (Solanum tuberosum), tomato (Solanum lycopersicum), wild tomato (Solanum pennellii), and two pepper cultivars (Capsicum annuum). Based on 8.5 Gbp sequence data, we annotated 82 983 SINE copies belonging to 10 families and subfamilies on a base pair level. Solanaceae SINEs are dispersed over all chromosomes with enrichments in distal regions. Depending on the genome assemblies and gene predictions, 30% of all SINE copies are associated with genes and are particularly frequent in introns and untranslated regions (UTRs). The close association with genes is family specific. More than 10% of all genes annotated in the Solanaceae species investigated contain at least one SINE insertion, and we found genes harbouring up to 16 SINE copies. We demonstrate the involvement of SINEs in gene and genome evolution, including the donation of splice sites, start and stop codons and exons to genes, the enlargement of introns and UTRs, the generation of tandem-like duplications and the transduction of adjacent sequence regions.
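The gene-association figures in the abstract above (share of SINE copies associated with genes, number of SINE copies per gene) can be illustrated with a small interval-overlap sketch in Python. It assumes, for illustration only, that "associated" means a SINE interval overlapping an annotated gene interval. The tuple layouts and the function name `sine_gene_overlaps` are hypothetical and do not come from the study.

```python
from collections import Counter


def sine_gene_overlaps(sines, genes):
    """Relate SINE copies to gene annotations by interval overlap.

    sines: list of (chrom, start, end), half-open coordinates
    genes: list of (gene_id, chrom, start, end), half-open coordinates
    Returns (fraction of SINEs overlapping any gene, SINE count per gene).
    """
    per_gene = Counter()
    associated = 0
    for chrom, s_start, s_end in sines:
        hit = False
        for gene_id, g_chrom, g_start, g_end in genes:
            # Two half-open intervals overlap if each starts before the other ends.
            if chrom == g_chrom and s_start < g_end and g_start < s_end:
                per_gene[gene_id] += 1
                hit = True
        associated += hit
    fraction = associated / len(sines) if sines else 0.0
    return fraction, per_gene
```

The nested loop is quadratic and only suitable for small examples. For genome-scale annotations, a sorted sweep or an interval tree would be the usual choice.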
SUMMARY Short interspersed nuclear elements (SINEs) are non-autonomous non-long terminal repeat retrotransposons which are widely distributed in eukaryotic organisms. While SINEs have been intensively studied in animals, only limited information is available about plant SINEs. We analysed 22 SINE families from seven genomes of the Amaranthaceae family and identified 34 806 SINEs, including 19 549 full-length copies. With the focus on sugar beet (Beta vulgaris), we performed a comparative analysis of the diversity, genomic and chromosomal organization and the methylation of SINEs to provide a detailed insight into the evolution and age of Amaranthaceae SINEs. The lengths of the SINE consensus sequences range from 113 nucleotides (nt) up to 224 nt. The SINEs show a dispersed distribution on all chromosomes but were found with higher incidence in subterminal euchromatic chromosome regions. The methylation of SINEs is increased compared with their flanking regions, and the strongest effect is visible for cytosines in the CHH context, indicating an involvement of asymmetric methylation in the silencing of SINEs.
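The CHH context mentioned above is one of three standard cytosine sequence contexts (CG, CHG, CHH, where H stands for A, C or T). A minimal Python sketch of how a cytosine can be assigned to its context follows. The function names are hypothetical, only the given strand is inspected, and positions with ambiguous or missing neighbouring bases are left unclassified; these are simplifying assumptions, not the method used in the study.

```python
from collections import Counter
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i as 'CG', 'CHG' or 'CHH'.

    Returns None if seq[i] is not a cytosine or if the context cannot
    be determined (sequence end or ambiguous base such as N).
    """
    seq = seq.upper()
    if seq[i] != "C" or i + 1 >= len(seq):
        return None
    if seq[i + 1] == "G":
        return "CG"
    if seq[i + 1] not in "ACT" or i + 2 >= len(seq):
        return None
    if seq[i + 2] == "G":
        return "CHG"
    if seq[i + 2] in "ACT":
        return "CHH"
    return None


def count_contexts(seq: str) -> Counter:
    """Count classifiable cytosines per context on the given strand."""
    counts = Counter()
    for i in range(len(seq)):
        context = cytosine_context(seq, i)
        if context is not None:
            counts[context] += 1
    return counts
```

CHH is called asymmetric because, unlike CG and CHG, the opposite strand carries no cytosine in a mirrored position. A full analysis would also classify cytosines on the reverse strand, for example by running the same function on the reverse complement.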
Summary Short interspersed nuclear elements (SINEs) are small, non‐autonomous and heterogeneous retrotransposons that are widespread in plants. To explore the amplification dynamics and evolutionary history of SINE populations in representative deciduous tree species, we analyzed the genomes of the following six Salicaceae species: Populus deltoides, Populus euphratica, Populus tremula, Populus tremuloides, Populus trichocarpa, and Salix purpurea. We identified 11 Salicaceae SINE families (SaliS‐I to SaliS‐XI), comprising 27 077 full‐length copies. Most of these families harbor segmental similarities, providing evidence for SINE emergence by reshuffling or heterodimerization. We observed two SINE groups, differing in phylogenetic distribution pattern, similarity and 3′ end structure. These groups probably emerged during the ‘salicoid duplication’ (~65 million years ago) in the Salix–Populus progenitor and during the separation of the genus Salix (45–65 million years ago), respectively. In contrast to the conserved 5′ start motifs across species and SINE families, the 3′ ends are highly variable in sequence and length. This extraordinary 3′‐end variability results from mutations in the poly(A) tail, which were fixed by subsequent amplificational bursts. We show that the dissemination of newly evolved 3′ ends is accomplished by a displacement of older motifs, leading to various 3′‐end subpopulations within the SaliS families.