Edward M. Rubin scite author profile

Genome sequencing enhances our understanding of the biological world by providing blueprints for the evolutionary and functional diversity that shapes the biosphere. However, microbial genomes that are currently available are of limited phylogenetic breadth, owing to our historical inability to cultivate most microorganisms in the laboratory. We apply single-cell genomics to target and sequence 201 uncultivated archaeal and bacterial cells from nine diverse habitats belonging to 29 major mostly uncharted branches of the tree of life, so-called 'microbial dark matter'. With this additional genomic information, we are able to resolve many intra-and inter-phylum-level relationships and to propose two new superphyla. We uncover unexpected metabolic features that extend our understanding of biology and challenge established boundaries between the three domains of life. These include a novel amino acid use for the opal stop codon, an archaeal-type purine synthesis in Bacteria and complete sigma factors in Archaea similar to those in Bacteria. The single-cell genomes also served to phylogenetically anchor up to 20% of metagenomic reads in some habitats, facilitating organism-level interpretation of ecosystem function. This study greatly expands the genomic representation of the tree of life and provides a systematic step towards a better understanding of biological evolution on our planet.Microorganisms are the most diverse and abundant cellular life forms on Earth, occupying every possible metabolic niche. The large majority of these organisms have not been obtained in pure culture and we have only recently become aware of their presence mainly through cultivationindependent molecular surveys based on conserved marker genes (chiefly small subunit ribosomal RNA; SSU rRNA) or through shotgun sequencing (metagenomics) 1,2 . As an increasing number of environments are deeply sequenced using next-generation technologies, diversity estimates for Bacteria and Archaea continue to rise, with the number of microbial 'species' predicted to reach well into the millions 3 . According to SSU rRNA-based phylogeny, these fall into at least 60 major lines of descent (phyla or divisions) within the bacterial and archaeal domains 4

show abstract

Initial sequencing and comparative analysis of the mouse genome

Waterston¹,

Lindblad‐Toh²,

Birney³

et al. 2002

Nature

6,159

1,810

View full text Add to dashboard Cite

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

show abstract

VISTA: computational tools for comparative genomics

Frazer¹,

Pachter²,

Poliakov³

et al. 2004

Nucleic Acids Research

2,236

1,747

View full text Add to dashboard Cite

Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here, we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/vista/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, to submit their own sequences of interest to several VISTA servers for various types of comparative analysis and to obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kb interval on human chromosome 5 that encodes for the kinesin family member 3A (KIF3A) protein.

show abstract

Community structure and metabolism through reconstruction of microbial genomes from the environment

Tyson

Chapman

Hugenholtz

et al. 2004

Nature

2,105

1,674

View full text Add to dashboard Cite

Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other genomes. This was possible because the biofilm was dominated by a small number of species populations and the frequency of genomic rearrangements and gene insertions or deletions was relatively low. Because each sequence read came from a different individual, we could determine that single-nucleotide polymorphisms are the predominant form of heterogeneity at the strain level. The Leptospirillum group II genome had remarkably few nucleotide polymorphisms, despite the existence of low-abundance variants. The Ferroplasma type II genome seems to be a composite from three ancestral strains that have undergone homologous recombination to form a large population of mosaic genomes. Analysis of the gene complement for each organism revealed the pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme environment.

show abstract

ChIP-seq accurately predicts tissue-specific activity of enhancers

Visel

Blow

et al. 2009

Nature

1,519

1,547

View full text Add to dashboard Cite

A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover because they are scattered among the vast noncoding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here we present the results of chromatin immunoprecipitation with the enhancer-associated protein p300 followed by massively parallel sequencing, and map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases demonstrated reproducible enhancer activity in the tissues that were predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities, and suggest that such data sets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.The initial sequencing of the human genome 1,2 , complemented by effective computational and experimental strategies for mammalian gene discovery 3,4 , has resulted in a virtually complete list of protein-coding sequences. In contrast, the genomic location and function of regulatory elements that orchestrate gene expression in the developing and adult body remain more obscure, hindering studies of their contribution to developmental processes and human disease. Evolutionary constraint of non-coding sequences can predict the location of enhancers in the genome 5-12 , but does not reveal when and where these enhancers are active in vivo. Furthermore, it has been suggested that a substantial proportion of regulatory elements is not sufficiently conserved to be detectable by comparative genomic methods [13][14][15][16] .

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Edward M. Rubin

Insights into the phylogeny and coding potential of microbial dark matter

Initial sequencing and comparative analysis of the mouse genome

VISTA: computational tools for comparative genomics

Community structure and metabolism through reconstruction of microbial genomes from the environment

ChIP-seq accurately predicts tissue-specific activity of enhancers

Contact Info

Product

Resources

About