Summary
The mammalian telencephalon plays critical roles in cognition, motor function, and emotion. While many of the genes required for its development have been identified, the distant-acting regulatory sequences orchestrating their in vivo expression are mostly unknown. Here we describe a digital atlas of in vivo enhancers active in subregions of the developing telencephalon. We identified over 4,600 candidate embryonic forebrain enhancers and studied the in vivo activity of 329 of these sequences in transgenic mouse embryos. We generated serial sets of histological brain sections for 145 reproducible forebrain enhancers, resulting in a publicly accessible web-based data collection comprising over 32,000 sections. We also used epigenomic analysis of human and mouse cortex tissue to directly compare the genome-wide enhancer architecture in these species. These data provide a primary resource for investigating gene regulatory mechanisms of telencephalon development and enable studies of the role of distant-acting enhancers in neurodevelopmental disorders.

Background
With rapid advancements in technology, the sequences of thousands of species' genomes are becoming available. Within these sequences are repeats that comprise significant portions of genomes. Successful annotation thus requires accurate discovery of repeats. As species-specific elements, repeats in newly sequenced genomes are likely to be unknown. Therefore, annotating newly sequenced genomes requires tools to discover repeats de novo. However, the currently available de novo tools have limitations concerning the size of the input sequence, ease of use, sensitivity to major types of repeats, consistency of performance, speed, and false positive rate.

Results
To address these limitations, I designed and developed Red, applying Machine Learning. Red is the first repeat-detection tool capable of labeling its training data and training itself automatically on an entire genome. Red is easy to install and use. It is sensitive to both transposons and simple repeats; in contrast, available tools such as RepeatScout and ReCon are sensitive to transposons, and WindowMasker to simple repeats. Red performed consistently well on seven genomes; the other tools performed well only on some genomes. Red is much faster than RepeatScout and ReCon and has a much lower false positive rate than WindowMasker. On human genes with five or more copies, Red was more specific than RepeatScout by a wide margin. When tested on genomes of unusual nucleotide compositions, Red located repeats with high sensitivity and maintained moderate false positive rates. Red outperformed the related tools on a bacterial genome. Red identified 46,405 novel repetitive segments in the human genome. Finally, Red is capable of processing assembled and unassembled genomes.

Conclusions
Red's innovative methodology and its excellent performance on seven different genomes represent a valuable advancement in the field of repeat discovery.

Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0654-5) contains supplementary material, which is available to authorized users.