Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques.
Genome organization is driven by forces affecting transcriptional state, but the relationship between transcription and genome architecture remains unclear. Here, we identified the Drosophila transcription factor Motif 1 Binding Protein (M1BP) in physical association with the gypsy chromatin insulator core complex, including the universal insulator protein CP190. M1BP is required for enhancer-blocking and barrier activities of the gypsy insulator as well as its proper nuclear localization. Genome-wide, M1BP specifically colocalizes with CP190 at Motif 1-containing promoters, which are enriched at topologically associating domain (TAD) borders. M1BP facilitates CP190 chromatin binding at many shared sites and vice versa. Both factors promote Motif 1-dependent gene expression and transcription near TAD borders genome-wide. Finally, loss of M1BP reduces chromatin accessibility and increases both inter- and intra-TAD local genome compaction. Our results reveal physical and functional interaction between CP190 and M1BP to activate transcription at TAD borders and mediate chromatin insulator-dependent genome organization.
Background: Non-human primates (NHPs) and humans share major biological mechanisms, functions, and responses due to their close evolutionary relationship and, as such, provide ideal animal models to study human diseases. RNA expression in NHPs provides specific signatures that are informative of disease mechanisms and therapeutic modes of action. Unlike the human transcriptome, the transcriptomes of major NHP animal models are yet to be comprehensively annotated. Results: In this manuscript, employing deep RNA sequencing of seven tissue samples, we characterize the transcriptomes of two commonly used NHP animal models: Cynomolgus macaque (Macaca fascicularis) and African green monkey (Chlorocebus aethiops). We present the Multi-Species Annotation (MSA) pipeline that leverages well-annotated primate species and annotates 99.8% of reconstructed transcripts. We elucidate tissue-specific expression profiles and report 13 experimentally validated novel transcripts in these NHP animal models. Conclusion: We report comprehensively annotated transcriptomes of two non-human primates, which we have made publicly available on a customized UCSC Genome Browser interface. The MSA pipeline is also freely available. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-846) contains supplementary material, which is available to authorized users.