Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.
Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (i.e. grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence/structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacities. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the micro-evolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.
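The poly(A)-site-associated motif AUGHAH reported above is written with IUPAC nucleotide codes, in which H stands for any base except G (A, C, or U). As an illustration only, the following minimal sketch shows how such a degenerate motif could be located in an RNA sequence. The function names `motif_to_regex` and `find_motif` and the example sequences are hypothetical and are not taken from the study, which does not describe its motif-search procedure here.

```python
import re

# IUPAC codes for the RNA alphabet; only the plain bases plus the
# degenerate codes needed for this illustration are listed.
IUPAC_RNA = {
    "A": "A",
    "C": "C",
    "G": "G",
    "U": "U",
    "H": "[ACU]",   # H = any base except G
    "N": "[ACGU]",  # N = any base
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'AUGHAH' into a regex string."""
    return "".join(IUPAC_RNA[ch] for ch in motif)


def find_motif(seq, motif="AUGHAH"):
    """Return 0-based start positions of all (possibly overlapping) hits."""
    # A lookahead lets the scan report overlapping occurrences.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequences, not data from the paper.
    print(find_motif("CCAUGUAACC"))
    print(find_motif("AUGGAA"))
```

A sequence such as AUGGAA is not reported as a hit, because G is excluded at the fourth position.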
MicroRNAs (miRNAs) are short non-coding RNAs with key roles in cellular regulation. As part of the fifth edition of the Functional Annotation of Mammalian Genome (FANTOM5) project, we created an integrated expression atlas of miRNAs and their promoters by deep-sequencing 492 short RNA (sRNA) libraries, with matching Cap Analysis Gene Expression (CAGE) data, from 396 human and 47 mouse RNA samples. Promoters were identified for 1,357 human and 804 mouse miRNAs and showed strong sequence conservation between species. We also found that primary and mature miRNA expression levels were correlated, allowing us to use the primary miRNA measurements as a proxy for mature miRNA levels in a total of 1,829 human and 1,029 mouse CAGE libraries. We thus provide a broad atlas of miRNA expression and promoters in primary mammalian cells, establishing a foundation for detailed analysis of miRNA expression patterns and transcriptional control regions.
Type 2 (or North American-like) porcine reproductive and respiratory syndrome virus (PRRSV) was first recorded in 1987 in the United States and now occurs in most commercial swine industries throughout the world. In this study, we investigated the epidemiological and evolutionary behaviors of type 2 PRRSV. Based on phylogenetic analyses of 8,624 ORF5 sequences, we described a comprehensive picture of the diversity of type 2 PRRSVs and systematically classified all available sequences into lineages and sublineages, including a number of previously undescribed lineages. With the rapid growth of sequence deposition into the databases, it would be technically difficult for veterinary researchers to genotype their sequences by reanalyzing all sequences in the databases. To this end, a set of reference sequences was established based on our classification system, which represents the principal diversity of all available sequences and can readily be used for further genotyping studies. In addition, we further investigated the demographic histories of these lineages and sublineages by using Bayesian coalescence analyses, providing evolutionary insights into several important epidemiological events of type 2 PRRSV. Moreover, by using a phylogeographic approach, we were able to estimate the transmission frequencies between the pig-producing states in the United States and identified several states as the major sources of viral spread, i.e., "transmission centers." In summary, this study represents the most extensive phylogenetic analyses of type 2 PRRSV to date, providing a basis for future genotyping studies and dissecting the epidemiology of type 2 PRRSV from phylogenetic perspectives.
Porcine reproductive and respiratory syndrome virus (PRRSV) is an economically important virus which infects swine and causes reproductive failure in sows and respiratory problems in growing pigs. As a member of the Arteriviridae family (15, 47, 59, 66), PRRSV has a positive-sense RNA genome of approximately 15 kb that carries eight overlapping open reading frames (ORFs), designated ORFs 1a, 1b, and 2 to 7 (15, 47). Among these ORFs, ORF5, encoding the major envelope glycoprotein, is an ideal candidate for phylogenetic tree construction, because it exhibits marked genetic variation within its relatively short length.
PRRSV can be classified into two genotypes: type 1 (EU-like), comprising mainly European strains and represented by the prototype strain Lelystad (75); and type 2 (NA-like), comprising mainly North American strains and represented by the prototype strain VR-2332 (14). Although clinical diseases are similar following infections with these viruses, they differ significantly in terms of antigenic properties (18, 74) and genetic content (42, 48, 51). This has sparked hot debates on the evolutionary history and divergence time of these two genotypes (24, 25, 29, 58), but no substantial consensus has been reached.
Classification and epidemiology of type 2 PRRSV.
Clinical disease due to type 2 PRRSV was first recorded in 1987 in the United Stat...