The transcription factor affinity prediction (TRAP) method calculates the affinity of transcription factors for DNA sequences on the basis of a biophysical model. This method has proven useful for several applications, including determining the putative target genes of a given factor. This protocol covers two other applications: (i) determining which transcription factors have the highest affinity in a set of sequences (illustrated with chromatin immunoprecipitation-sequencing (ChIP-seq) peaks), and (ii) finding which factor is most affected by a regulatory single-nucleotide polymorphism. The protocol describes how to use the TRAP web tools to address these questions, and it also presents a way to run TRAP on random control sequences to better estimate the significance of the results. All of the tools are available online and require no additional installation. The complete protocol takes about 45 min, but each individual tool runs in a few minutes.
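The idea of scoring random control sequences to judge significance can be illustrated with a minimal Python sketch. It is not the TRAP implementation or the protocol's actual procedure: `toy_affinity` is a hypothetical stand-in that simply counts motif occurrences, the motif and example sequence are invented for illustration, and shuffled copies of the input are only one assumed way of building controls. The sketch shows the general pattern of comparing an observed score with a control distribution to obtain an empirical p-value.

```python
import random


def toy_affinity(seq, motif="TGACTCA"):
    # Hypothetical stand-in score: counts exact motif occurrences.
    # This is NOT the TRAP biophysical model.
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def shuffle_sequence(seq, rng):
    # Build a control sequence with the same nucleotide composition
    # (assumed control strategy, chosen for illustration only).
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def empirical_p_value(observed, control_scores):
    # Fraction of control scores at least as large as the observed score.
    # The +1 pseudocount keeps the estimate from being exactly zero.
    exceed = sum(1 for s in control_scores if s >= observed)
    return (exceed + 1) / (len(control_scores) + 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    peak = "ACGTTGACTCAGGCTTGACTCATT"  # invented example sequence
    observed = toy_affinity(peak)
    controls = [toy_affinity(shuffle_sequence(peak, rng)) for _ in range(999)]
    print("observed score:", observed)
    print("empirical p-value:", empirical_p_value(observed, controls))
```

With 999 controls, the smallest p-value this estimator can report is 0.001, so the number of control sequences sets the resolution of the significance estimate.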
Researchers, supported by data from polyploid plants, have suggested that whole genome duplication (WGD) may induce genomic instability and rearrangement, an idea that could have important implications for vertebrate evolution. Taking advantage of the newly released genome sequence of amphioxus (Branchiostoma floridae), an invertebrate whose genome researchers have hoped is representative of the ancestral chordate genome, we have used gene proximity conservation to estimate rates of genome rearrangement throughout vertebrates and some of their invertebrate ancestors. We find that, while amphioxus remains the best single source of invertebrate information about the early chordate genome, its genome structure is not particularly well conserved and it cannot be considered a fossilization of the vertebrate preduplication genome. In agreement with previous reports, we identify two WGD events in early vertebrates and another in teleost fish. However, we find that the early vertebrate WGD events were not followed by increased rates of genome rearrangement. Indeed, we measure massive genome rearrangement prior to these WGD events. We propose that the vertebrate WGD events may have been symptoms of a preexisting predisposition toward genomic structural change.
Animal genomes possess highly conserved cis-regulatory sequences that are often found near genes that regulate transcription and development. Researchers have proposed that the strong conservation of these sequences may affect the evolution of the surrounding genome, both by repressing rearrangement and possibly by promoting duplicate gene retention. Conflicting data, however, have made the validity of these propositions unclear. Here, we use a new computational method to identify phylogenetically conserved noncoding elements (PCNEs) in a manner that is not biased by rearrangement and duplication. This method is powerful enough to identify more than a thousand PCNEs that have been conserved between vertebrates and the basal chordate amphioxus. We test 42 of our PCNEs in transgenic zebrafish assays (including examples from vertebrates and amphioxus) and find that the majority are functional enhancers. We find that PCNEs are enriched around genes with ancient synteny conservation, and that this association is strongest for extragenic PCNEs, suggesting that cis-regulatory interdigitation plays a key role in repressing genome rearrangement. Next, we classify mouse and zebrafish genes according to association with PCNEs, synteny conservation, duplication history, and presence in bidirectional promoter pairs, and use these data to cluster gene functions into a series of distinct evolutionary patterns. These results demonstrate that subfunctionalization of conserved cis-regulation has not been the primary determinant of gene duplicate retention in vertebrates. Instead, the data support the gene balance hypothesis, which proposes that duplicate retention has been driven by selection against dosage imbalances in genes with many protein connections.
Open access to sequence data is a cornerstone of biology and biodiversity research, but has created tension under the United Nations Convention on Biological Diversity (CBD). Policy decisions could compromise research and development, unless a practical multilateral solution is implemented. Here, we lay out a framework for use of digital sequence information (DSI) that enables fair benefit-sharing, ensures open access to sequence data, strengthens biodiversity conservation and sustainable use, and leverages genomics and bioinformatics for international capacity-building. As Parties to the CBD meet again in person in the coming months to negotiate the Global Biodiversity Framework, they must apply pragmatic, multilateral solutions to DSI that improve rather than impede global biodiversity targets. The ability to decode and digitally archive DNA has revolutionized the life sciences and related fields. Sequence data, referred to as digital sequence information (DSI) in policy