Background: The majority of human non-protein-coding DNA is made up of repetitive sequences, mainly transposable elements (TEs). It is becoming increasingly apparent that many of these repetitive DNA sequence elements encode gene regulatory functions. This fact has important evolutionary implications, since repetitive DNA is the most dynamic part of the genome. We set out to assess the evolutionary rate and pattern of experimentally characterized human transcription factor binding sites (TFBS) that are derived from repetitive versus non-repetitive DNA to test whether repeat-derived TFBS are in fact rapidly evolving. We also evaluated the position-specific patterns of variation among TFBS to look for signs of functional constraint on TFBS derived from repetitive and non-repetitive DNA.
Background: Retrotransposons, the most abundant and widespread class of eukaryotic transposable elements, are believed to play a significant role in mutation and disease and to have contributed significantly to the evolution of genome structure and function. The recent sequencing of the chimpanzee genome is providing an unprecedented opportunity to study the functional significance of these elements in two closely related primate species and to better evaluate their role in primate evolution.
Background: Retrotransposons have been shown to contribute to evolution of both structure and regulation of protein coding genes. It has been postulated that the primary mechanism by which retrotransposons contribute to structural gene evolution is through insertion into an intron or a gene flanking region, and subsequent incorporation into an exon.
We analyzed the chicken (Gallus gallus) genome sequence to search for previously uncharacterized endogenous retrovirus (ERV) sequences using ab initio and combined evidence approaches. We discovered 11 novel families of ERVs that occupy more than 21 million base pairs, approximately 2%, of the chicken genome. These novel families include a number of recently active full-length elements possessing identical long terminal repeats (LTRs) as well as intact gag and pol open reading frames. The abundance and diversity of chicken ERVs we discovered underscore the utility of an approach that combines multiple methods for the identification of interspersed repeats in vertebrate genomes.
Reviewers: This article was reviewed by Igor Zhulin and Itai Yanai.
Findings
Chicken, a modern descendant of the dinosaurs, is the first avian to have its genome sequenced [1]. Phylogenetically, its position between fish and mammals provides valuable insight into the evolution of vertebrates. The chicken genome has a size of 1.2 billion bases, approximately one third of the size of the human genome.
The overall interspersed repeat, i.e. transposable element (TE), content of the chicken genome was determined to be less than 9% [1]. This fraction is considerably lower than that of mammalian genomes, where TEs account for 40-50% of genomic sequences [2][3][4]. While chicken has long been a model system for the study of retroviruses [5], a mere 1.3% of the chicken genome can be classified as endogenous retroviruses (ERVs), compared to about 5% in humans [3]. Nevertheless, protein coding sequences still make up only a minor fraction of the chicken genome, leaving a substantial portion that has yet to be accounted for. The authors of the initial analysis of the chicken genome posited that much of the uncharacterized sequence was likely to be derived from unrecognized TEs [1].
Indeed, novel or previously uncharacterized TE sequences may be missed by homology-based methods for the detection of repeats, such as the widely used RepeatMasker program [6], which rely on the comparison of genomic sequences to libraries of known repeat consensus sequences. Ab initio methods, on the other hand, identify repeats by virtue of their structural characteristics without regard to any sequence similarity to known elements. We used a combination of ab initio detection, sequence similarity searches, motif identification and evaluation of element structural (repeat) features to search for novel ERVs that may have been missed in the initial analysis of the chicken genome.
LTR_STRUC was the first ab initio program designed to detect long terminal repeat (LTR) containing elements, such as ERVs, in genomic sequence [7]. Briefly, LTR_STRUC works by sliding a window along genomic