Genes differentially expressed in different tissues, during development, or during specific pathologies are of foremost interest to both basic and pharmaceutical research. “Transcript profiles” or “digital Northerns” are generated routinely by partially sequencing thousands of randomly selected clones from relevant cDNA libraries. Differentially expressed genes can then be detected from variations in the counts of their cognate sequence tags. Here we present the first systematic study on the influence of random fluctuations and sampling size on the reliability of this kind of data. We establish a rigorous significance test and demonstrate its use on publicly available transcript profiles. The theory links the threshold of selection of putatively regulated genes (e.g., the number of pharmaceutical leads) to the fraction of false positive clones one is willing to risk. Our results delineate more precisely and extend the limits within which digital Northern data can be used.
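As a rough illustration of how sampling size and random fluctuation enter such a comparison, the sketch below applies a standard exact conditional binomial test to the tag counts of one gene in two libraries. This is a generic stand-in, not the significance test established in the paper. The function name, the argument names (`x`, `y`, `n1`, `n2`), and the null hypothesis of equal expression in both libraries are assumptions made for illustration only.

```python
from math import comb


def tag_count_pvalue(x: int, y: int, n1: int, n2: int) -> float:
    """Two-sided exact p-value for tag counts x and y of one gene.

    x, y   -- tag counts of the gene in library 1 and library 2 (hypothetical names)
    n1, n2 -- total number of clones sequenced in each library

    Assumed null hypothesis: the gene is equally expressed in both libraries,
    so, given the total x + y, y follows Binomial(x + y, n2 / (n1 + n2)).
    Intended for the small counts typical of tag sampling; very large totals
    would overflow the float arithmetic used here.
    """
    total = x + y
    p = n2 / (n1 + n2)
    # Probability of every possible split of the total between the libraries.
    pmf = [comb(total, k) * p**k * (1 - p) ** (total - k) for k in range(total + 1)]
    observed = pmf[y]
    # Sum the probabilities of all splits at least as unlikely as the observed one.
    tail = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


if __name__ == "__main__":
    # Same library sizes, 0 tags versus 10 tags.
    print(tag_count_pvalue(0, 10, 1000, 1000))
    # Same counts, but library 2 is ten times larger.
    print(tag_count_pvalue(0, 10, 1000, 10000))
```

Under these assumptions, the same pair of counts becomes much less striking when the second library is far larger, which is why a selection threshold has to account for the number of clones sampled in each library.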
Coccolithophores have influenced the global climate for over 200 million years [1]. These marine phytoplankton can account for 20 per cent of total carbon fixation in some systems [2]. They form blooms that can occupy hundreds of thousands of square kilometres and are distinguished by their elegantly sculpted calcium carbonate exoskeletons (coccoliths), rendering them visible from space [3]. Although coccolithophores export carbon in the form of organic matter and calcite to the sea floor, they also release CO2 in the calcification process. Hence, they have a complex influence on the carbon cycle, driving either CO2 production or uptake, sequestration and export to the deep ocean [4]. Here we report the first haptophyte reference genome, from the coccolithophore Emiliania huxleyi strain CCMP1516, and sequences from 13 additional isolates. Our analyses reveal a pan genome (core genes plus genes distributed variably between strains) probably supported by an atypical complement of repetitive sequence in the genome. Comparisons across strains demonstrate that E. huxleyi, which has long been considered a single species, harbours extensive genome variability reflected in different metabolic repertoires. Genome variability within this species complex seems to underpin its capacity both to thrive in habitats ranging from the equator to the subarctic and to form large-scale episodic blooms under a wide variety of environmental conditions.
With DNA genomes up to 2.5 Mb packed in particles of bacterium-like shape and dimension, the first two Acanthamoeba-infecting Pandoraviruses have remained the most spectacular viruses since their description in 2013. Our isolation of three new strains from distant locations and environments allowed us to perform the first comparative genomics analysis of the emerging worldwide-distributed Pandoraviridae family. Thorough annotation of the genomes, combining transcriptomic, proteomic, and bioinformatic analyses, led to the discovery of many non-coding transcripts while significantly reducing the former set of predicted protein-coding genes. We found that the Pandoraviridae exhibit an open pan genome, the enormous size of which is not adequately explained by gene duplications or horizontal transfers. As most of the strain-specific genes have no extant homolog and exhibit statistical features comparable to intergenic regions, we suggest that de novo gene creation is a strong component in the evolution of the giant Pandoravirus genomes.
Alternate polyadenylation is an important post-transcriptional regulatory process now open to large-scale analysis by use of cDNA databases. We clustered 164,000 expressed sequence tags (ESTs) into ∼15,000 groups and aligned each group to a putative mRNA 3′ end. By use of stringent criteria to discard artifactual mRNA extremities, clear evidence for alternate polyadenylation was obtained in 189 of the 1000 EST clusters studied. A number of previously unreported polyadenylation sites were identified, together with possible instances of tissue-specific differential polyadenylation. This study demonstrates that, besides quantitative aspects of gene expression, the distribution of alternate mRNA forms can be analyzed through EST sampling.
Marine protist diversity inventories have largely focused on planktonic environments, while benthic protists have received relatively little attention. We therefore hypothesize that current diversity surveys have only skimmed the surface of protist diversity in marine sediments, which may harbor greater diversity than planktonic environments. We tested this by analyzing sequences of the hypervariable V4 18S rRNA from benthic and planktonic protist communities sampled in European coastal regions. Despite a similar number of OTUs in both realms, richness estimations indicated that we recovered at least 70% of the diversity in planktonic protist communities, but only 33% in benthic communities. There was also little overlap of OTUs between planktonic and benthic communities, as well as between separate benthic communities. We argue that these patterns reflect the heterogeneity and diversity of benthic habitats. A comparison of all OTUs against the Protist Ribosomal Reference database showed that a higher proportion of benthic than planktonic protist diversity is missing from public databases; similar results were obtained by comparing all OTUs against environmental references from NCBI's Short Read Archive. We suggest that the benthic realm may therefore be the world's largest reservoir of marine protist diversity, with most taxa at present undescribed.
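To make the notion of a recovered fraction of diversity concrete, the sketch below computes the Chao1 richness estimate from per-OTU read counts and divides the observed OTU number by it. The abstract does not say which richness estimator was used, so Chao1 is an assumption chosen for illustration, and the function names and example counts are hypothetical.

```python
from typing import Iterable


def chao1(otu_counts: Iterable[int]) -> float:
    """Chao1 richness estimate from per-OTU read counts.

    Chao1 = S_obs + F1^2 / (2 * F2), where F1 and F2 are the numbers of OTUs
    seen exactly once and exactly twice. When there are no doubletons, the
    bias-corrected form S_obs + F1 * (F1 - 1) / 2 is used instead.
    """
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2


def recovered_fraction(otu_counts: Iterable[int]) -> float:
    """Observed OTU richness divided by the Chao1 estimate."""
    counts = [c for c in otu_counts if c > 0]
    estimate = chao1(counts)
    return len(counts) / estimate if estimate > 0 else 0.0


if __name__ == "__main__":
    # Made-up counts: many singletons suggest much unsampled diversity.
    example = [1, 1, 1, 1, 2, 2, 5, 10]
    print(chao1(example), recovered_fraction(example))
```

A community dominated by singleton OTUs yields a low recovered fraction under this estimator, which is the kind of pattern that would point to a largely unsampled reservoir of diversity.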