The ciliate protozoan Tetrahymena thermophila contains two types of structurally and functionally differentiated nuclei: the transcriptionally active somatic macronucleus (MAC) and the transcriptionally silent germ-line micronucleus (MIC). Here, we demonstrate that MAC features well-positioned nucleosomes downstream of transcription start sites and flanking splice sites. Transcription-associated trans-determinants promote nucleosome positioning in MAC. By contrast, nucleosomes in MIC are dramatically delocalized. Nucleosome occupancies in MAC and MIC are nonetheless highly correlated with each other, as well as with in vitro reconstitution and with predictions based upon DNA sequence features, revealing unexpectedly strong contributions from cis-determinants. In particular, well-positioned nucleosomes are often matched with GC content oscillations. As many nucleosomes are coordinately accommodated by both cis- and trans-determinants, we propose that their distribution is shaped by the impact of these nucleosomes on the mutational and transcriptional landscape, and driven by evolutionary selection.
Side effect machines produce features for classifiers that distinguish different types of DNA sequences. They also have the as-yet-unexploited potential to give insight into biological features of the sequences. We introduce several innovations to the production and use of side effect machine sequence features. We compare the results of using consensus sequences and genomic sequences for training classifiers and find that more accurate results can be obtained using genomic sequences. Surprisingly, we were even able to build a classifier that distinguished consensus sequences from genomic sequences with high accuracy, suggesting that consensus sequences are not always representative of their genomic counterparts. We apply our techniques to the problem of distinguishing two types of transposable elements, solo LTRs and SINEs. Identifying these sequences is important because they affect gene expression, genome structure, and genetic diversity, and they serve as genetic markers. They are of similar length, neither codes for protein, and both have many nearly identical copies throughout the genome. Being able to efficiently and automatically distinguish them will aid efforts to improve annotations of genomes. Our approach reveals structural characteristics of the sequences of potential interest to biologists.
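As a rough illustration of how a side effect machine can turn a DNA sequence into a feature vector, the sketch below runs a sequence through a small finite state machine and counts how often each state is entered. The three-state transition table, the function name `run_sem`, and the choice not to count the start state are all assumptions made for illustration; the abstract does not specify the machines used or how they were produced.

```python
def run_sem(transitions, start_state, sequence):
    """Run `sequence` through a finite state machine and return, for each
    state, the number of times that state was entered (the "side effect")."""
    counts = [0] * len(transitions)
    state = start_state
    for base in sequence:
        state = transitions[state][base]  # follow the transition for this base
        counts[state] += 1                # increment the counter of the new state
    return counts


# Hypothetical 3-state machine over the DNA alphabet (illustration only).
TRANSITIONS = [
    {"A": 0, "C": 1, "G": 1, "T": 0},  # state 0
    {"A": 0, "C": 2, "G": 2, "T": 0},  # state 1
    {"A": 0, "C": 2, "G": 2, "T": 1},  # state 2
]

# The count vector is the feature vector handed to a downstream classifier.
features = run_sem(TRANSITIONS, 0, "ACGGTA")
print(features)
```

Because every base causes exactly one state entry, the counts always sum to the sequence length, so sequences of similar length yield directly comparable vectors.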
The most controversial part of genetic programming is its highly disruptive and potentially innovative subtree crossover operator. The clearest problem with the crossover operator is its potential to induce defensive metaselection for large parse trees, a process usually termed "bloat." Single parent genetic programming is a form of genetic programming in which bloat is reduced by doing subtree crossover with a fixed population of ancestor trees. Analysis of mean tree size growth demonstrates that this fixed and limited set of crossover partners provides implicit, automatic control on tree size in the evolving population, reducing the need for additionally disruptive trimming of large trees. The choice of ancestor trees can also incorporate expert knowledge into the genetic programming system. The system is tested on four problems: plus-one-recall-store (PORS), odd parity, plus-times-half (PTH) and a bioinformatic model fitting problem (NIPs). The effectiveness of the technique varies with the problem and choice of ancestor set. At the extremes, improvements in time to solution in excess of 4700-fold were observed for the PORS problem, while no significant improvement was observed for the PTH problem.
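The idea of crossing an evolving tree only with members of a fixed ancestor set can be sketched as follows. This is a minimal illustration under stated assumptions, not the operator from the paper: the tuple-based tree encoding, the uniform choice of crossover points, and the names `single_parent_crossover` and `ANCESTORS` are all hypothetical.

```python
import random

# Assumed encoding: a tree is a terminal (str) or a tuple (op, child, child, ...).


def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))


def replace_at(tree, path, new_subtree):
    """Return a copy of `tree` with the node at `path` replaced by `new_subtree`."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in subtrees(tree))


def single_parent_crossover(individual, ancestors, rng):
    """Replace a random subtree of `individual` with a random subtree taken
    from one of the fixed `ancestors`; the ancestors are never modified."""
    ancestor = rng.choice(ancestors)
    _, donor = rng.choice(list(subtrees(ancestor)))
    path, _ = rng.choice(list(subtrees(individual)))
    return replace_at(individual, path, donor)


# Hypothetical fixed ancestor set (illustration only).
ANCESTORS = [("+", "x", "1"), ("*", ("+", "x", "1"), "x")]

rng = random.Random(0)
individual = ("+", ("*", "x", "x"), "1")
child = single_parent_crossover(individual, ANCESTORS, rng)
print(child, size(child))
```

In this sketch a single crossover can add at most as many nodes as the largest ancestor contains, which gives some intuition for why a fixed, limited donor pool constrains tree growth.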