To elucidate cellular machinery on a global scale, we performed a multiple comparison of the recently available protein-protein interaction networks of Caenorhabditis elegans, Drosophila melanogaster, and Saccharomyces cerevisiae. This comparison integrated protein interaction and sequence information to reveal 71 network regions that were conserved across all three species and many exclusive to the metazoans. We used this conservation, and found statistically significant support for 4,645 previously undescribed protein functions and 2,609 previously undescribed protein interactions. We tested 60 interaction predictions for yeast by two-hybrid analysis, confirming approximately half of these. Significantly, many of the predicted functions and interactions would not have been identified from sequence similarity alone, demonstrating that network comparisons provide essential biological information beyond what is gleaned from the genome.
Keywords: comparative analysis | multiple alignment | protein network | yeast two-hybrid
A major challenge of postgenomic biology is to understand the complex networks of interacting genes, proteins, and small molecules that give rise to biological form and function. Advances in whole-genome approaches are now enabling us to characterize these networks systematically, by using procedures such as the two-hybrid assay (1) and protein coimmunoprecipitation (2) to screen for protein-protein interactions. To date, these technologies have generated large interaction networks for bacteria (3), yeast (4-7), and, recently, fruit fly (8) and nematode worm (9).
The large amount of protein interaction data now available presents opportunities and challenges in understanding evolution and function. Such challenges involve assigning functional roles to interactions (10), separating true protein-protein interactions from false positives (11), and, ultimately, organizing large-scale interaction data into models of cellular signaling and regulatory machinery.
As is often the case in biology, an approach based on evolutionary cross-species comparisons provides a valuable framework for addressing these challenges. However, although methods for comparing DNA and protein sequences have been a mainstay of bioinformatics over the past 30 years, development of similar tools at other levels of biological information, including protein interactions (12-14), metabolic networks (15-17), or gene expression data (18-20), is just beginning.
Recently, we devised a method called PATHBLAST (13) for comparing the protein interaction networks of two species. Just as BLAST performs rapid pairwise alignment of protein sequences (21), PATHBLAST is based on efficient alignment of two protein networks to identify conserved network regions. Here, we extend this approach to present a computational framework for alignment and comparison of more than two protein networks. We apply this multiple network alignment strategy to compare the recently available protein networks for worm, fly, and yeast, and show that although any single net...
We implement a strategy for aligning two protein-protein interaction networks that combines interaction topology and protein sequence similarity to identify conserved interaction pathways and complexes. Using this approach we show that the protein-protein interaction networks of two distantly related species, Saccharomyces cerevisiae and Helicobacter pylori, harbor a large complement of evolutionarily conserved pathways, and that a large number of pathways appears to have duplicated and specialized within yeast. Analysis of these findings reveals many well characterized interaction pathways as well as many unanticipated pathways, the significance of which is reinforced by their presence in the networks of both species.
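The core idea in the abstract above, combining interaction topology with sequence similarity, can be illustrated with a much simpler operation than the published method: listing interactions in one species whose two endpoints both have homologs that also interact in the other species. The sketch below is not the authors' alignment algorithm (it does no scoring, allows no gaps or mismatches, and finds single conserved edges rather than pathways). The function name, the protein identifiers, and the assumption that a homolog table has already been computed from sequence similarity are all hypothetical.

```python
from itertools import product


def conserved_interactions(edges_a, edges_b, homologs):
    """Pair each interaction in network A with matching interactions in network B.

    edges_a, edges_b: iterables of (protein, protein) tuples, treated as undirected.
    homologs: dict mapping a protein in A to a list of its homologs in B
              (assumed to come from a prior sequence-similarity search).
    Returns a list of ((a1, a2), (b1, b2)) tuples.
    """
    # Store B's interactions as frozensets so lookup ignores edge direction.
    edge_set_b = {frozenset(edge) for edge in edges_b}

    conserved = []
    for u, v in edges_a:
        # Try every combination of homologs for the two endpoints.
        for hu, hv in product(homologs.get(u, ()), homologs.get(v, ())):
            if frozenset((hu, hv)) in edge_set_b:
                conserved.append(((u, v), (hu, hv)))
    return conserved


if __name__ == "__main__":
    # Toy data with made-up protein names.
    network_a = [("a1", "a2"), ("a2", "a3")]
    network_b = [("b2", "b1"), ("b3", "b4")]
    homolog_table = {"a1": ["b1"], "a2": ["b2"], "a3": ["b5"]}
    print(conserved_interactions(network_a, network_b, homolog_table))
```

In this toy input only the a1-a2 interaction is reported, because a3's homolog has no interaction partner in the second network. A fuller alignment would extend such matched edges into longer paths and score them against a background model.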
Unbiased next-generation sequencing (NGS) approaches enable comprehensive pathogen detection in the clinical microbiology laboratory and have numerous applications for public health surveillance, outbreak investigation, and the diagnosis of infectious diseases. However, practical deployment of the technology is hindered by the bioinformatics challenge of analyzing results accurately and in a clinically relevant timeframe. Here we describe SURPI ("sequence-based ultrarapid pathogen identification"), a computational pipeline for pathogen identification from complex metagenomic NGS data generated from clinical samples, and demonstrate use of the pipeline in the analysis of 237 clinical samples comprising more than 1.1 billion sequences. Deployable on both cloud-based and standalone servers, SURPI leverages two state-of-the-art aligners for accelerated analyses, SNAP and RAPSearch, which are as accurate as existing bioinformatics tools but orders of magnitude faster in performance. In fast mode, SURPI detects viruses and bacteria by scanning data sets of 7-500 million reads in 11 min to 5 h, while in comprehensive mode, all known microorganisms are identified, followed by de novo assembly and protein homology searches for divergent viruses in 50 min to 16 h. SURPI has also directly contributed to real-time microbial diagnosis in acutely ill patients, underscoring its potential key role in the development of unbiased NGS-based clinical assays in infectious diseases that demand rapid turnaround times.
Although metagenomics has been previously employed for pathogen discovery, its cost and complexity have prevented its use as a practical front-line diagnostic for unknown infectious diseases. Here we demonstrate the utility of two metagenomics-based strategies, a pan-viral microarray (Virochip) and deep sequencing, for the identification and characterization of 2009 pandemic H1N1 influenza A virus. Using nasopharyngeal swabs collected during the earliest stages of the pandemic in Mexico, Canada, and the United States (n = 17), the Virochip was able to detect a novel virus most closely related to swine influenza viruses without a priori information. Deep sequencing yielded reads corresponding to 2009 H1N1 influenza in each sample (percentage of aligned sequences corresponding to 2009 H1N1 ranging from 0.0011% to 10.9%), with up to 97% coverage of the influenza genome in one sample. Detection of 2009 H1N1 by deep sequencing was possible even at titers near the limits of detection for specific RT-PCR, and the percentage of sequence reads was linearly correlated with virus titer. Deep sequencing also provided insights into the upper respiratory microbiota and host gene expression in response to 2009 H1N1 infection. An unbiased analysis combining sequence data from all 17 outbreak samples revealed that 90% of the 2009 H1N1 genome could be assembled de novo without the use of any reference sequence, including assembly of several near full-length genomic segments. These results indicate that a streamlined metagenomics detection strategy can potentially replace the multiple conventional diagnostic tests required to investigate an outbreak of a novel pathogen, and provide a blueprint for comprehensive diagnosis of unexplained acute illnesses or outbreaks in clinical and public health settings.