The freshwater cnidarian Hydra was first described in 1702 [1] and has been the object of study for 300 years. Experimental studies of Hydra between 1736 and 1744 culminated in the discovery of asexual reproduction of an animal by budding, the first description of regeneration in an animal, and successful transplantation of tissue between animals [2]. Today, Hydra is an important model for studies of axial patterning [3], stem cell biology [4] and regeneration [5]. Here we report the genome of Hydra magnipapillata and compare it to the genomes of the anthozoan Nematostella vectensis [6] and other animals. The Hydra genome has been shaped by bursts of transposable element expansion, horizontal gene transfer, trans-splicing, and simplification of gene structure and gene content that parallel simplification of the Hydra life cycle. We also report the sequence of the genome of a novel bacterium stably associated with H. magnipapillata. Comparisons of the Hydra genome to the genomes of other animals shed light on the evolution of epithelia, contractile tissues, developmentally regulated transcription factors, the Spemann–Mangold organizer, pluripotency genes and the neuromuscular junction.
We develop methods to measure and characterize symmetry at multiple orders, and analyze a wide set of genomes, encompassing single- and double-stranded RNA and DNA viruses, bacteria, archaea, mitochondria, and eukaryota. We quantify symmetry at orders 1 to 9 for contiguous sequences and pools of coding and non-coding upstream regions, compare the observed symmetry levels to those predicted by simple statistical models, and factor out the effect of lower-order distributions. We establish the universality and variability range of first-order strand symmetry, as well as of its higher-order extensions, and demonstrate the existence of genuine high-order symmetric constraints. We show that ubiquitous reverse-complement symmetry does not result from a single cause, such as point mutation or recombination, but rather emerges from the combined effects of a wide spectrum of mechanisms operating at multiple orders and length scales.
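As a rough illustration of what order-k strand symmetry means, the sketch below counts k-mers on a single strand and compares each count with that of its reverse complement. The `symmetry_index` formula and all function names are assumptions chosen for this example; they are not the measure used in the abstract above.

```python
from collections import Counter

# Translation table for complementing DNA bases (other symbols pass through).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Count all overlapping k-mers on the given strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def symmetry_index(seq, k):
    """Illustrative order-k symmetry score in [0, 1].

    1.0 means every k-mer occurs exactly as often as its reverse
    complement on this strand; 0.0 means no overlap at all.
    """
    counts = kmer_counts(seq, k)
    words = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    total = sum(counts[w] + counts[reverse_complement(w)] for w in words)
    return 1.0 - diff / total if total else 1.0
```

A sequence concatenated with its own reverse complement scores 1.0 at every order, while a homopolymer such as `"AAAA"` scores 0.0 at order 1.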
Three different representations for a thresholded linear equation are developed. For binary input they are shown to be representationally equivalent though their training characteristics differ. A training algorithm for linear equations is discussed. The similarities between its simplest mathematical representation (perceptron training), a formal model of animal learning (Rescorla-Wagner learning), and one mechanism of neural learning (Aplysia gill withdrawal) are pointed out. For d input features, perceptron training is shown to have a lower bound of 2^d and an upper bound of d^d adjusts. It is possible that the true upper bound is 4^d, though this has not been proved. Average performance is shown to have a lower bound of 1.4^d. Learning time is shown to increase linearly with the number of irrelevant or replicated features. The (X of N) function (a subset of linearly separable functions containing OR and AND) is shown to be learnable in d^3 time. A method of utilizing conditional probability to accelerate learning is proposed. This reduces the observed growth rate from 4^d to the theoretical minimum (for the unmodified version) of 2^d. A different version reduces the growth rate to about 1.7^d. The linear effect of irrelevant features can also be eliminated. Whether such an approach can be made provably convergent is not known.
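The perceptron training referred to above can be sketched as follows. This is a minimal version of the textbook perceptron rule applied to an (X of N) target over all binary inputs, counting weight adjustments; the function names, the zero initialization, and the `max_epochs` cutoff are assumptions for illustration and are not taken from the paper.

```python
from itertools import product


def x_of_n(x_threshold):
    """Return a target that is 1 when at least x_threshold inputs are on."""
    return lambda bits: int(sum(bits) >= x_threshold)


def predict(weights, bias, bits):
    """Thresholded linear equation: output 1 if the weighted sum exceeds 0."""
    return int(sum(w * b for w, b in zip(weights, bits)) + bias > 0)


def train_perceptron(d, target, max_epochs=1000):
    """Cycle through all 2^d binary inputs until none is misclassified.

    Returns (weights, bias, adjusts), where adjusts is the number of
    weight updates made before convergence.
    """
    weights = [0] * d
    bias = 0
    adjusts = 0
    inputs = list(product((0, 1), repeat=d))
    for _ in range(max_epochs):
        errors = 0
        for bits in inputs:
            delta = target(bits) - predict(weights, bias, bits)
            if delta != 0:
                # Move weights toward (delta=+1) or away from (delta=-1) the input.
                weights = [w + delta * b for w, b in zip(weights, bits)]
                bias += delta
                adjusts += 1
                errors += 1
        if errors == 0:
            return weights, bias, adjusts
    raise RuntimeError("did not converge within max_epochs")
```

Calling `train_perceptron(4, x_of_n(2))` trains on the "at least 2 of 4" function; comparing the returned `adjusts` across values of d gives a rough empirical feel for how the adjustment count grows, though it does not reproduce the bounds stated in the abstract.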
Single- and double-stranded spatial distributions of most over-represented k-mers are highly non-random, and predominantly cluster into a small number of classes that are robust with respect to over-representation measures. Specifically, we show that the three most common distribution patterns can be related to DNA structure, function, and evolution and correspond to: (a) homologous ORF clusters associated with sharply localized distributions; (b) regulatory elements associated with a symmetric broad hill-shaped distribution in the 50-200 bp USR; and (c) runs of As, Ts, and ATs associated with a broad hill-shaped distribution also in the 50-200 bp USR, with extreme structural properties. Analyses of over-representation, homology, localization, and DNA structure are essential components of a general data-mining approach to finding biologically important k-mers in raw genomic DNA and understanding the 'lexicon' of regulatory regions.