Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing "microbial dark matter" that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. On single-cell bacterial datasets, SPAdes improves on the recently developed E+V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (cultivated monostrain) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet. Thus, recently developed single-cell assemblers not only enable single-cell sequencing, but also improve on conventional assemblers on their own turf. SPAdes is available for free online download under a GPLv2 license.
In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set. Molecular & Cellular Proteomics 11: 10.1074/mcp.M111.008524, 1-13, 2012.
In the past two decades, proteomics was dominated by bottom-up mass spectrometry that analyzes digested peptides rather than intact proteins. Bottom-up approaches, although powerful, do have limitations in analyzing protein species, e.g. various proteolytic forms of the same protein or various protein isoforms resulting from alternative splicing. Top-down mass spectrometry focuses on analyzing intact proteins and large peptides (1-10) and has advantages in localizing multiple post-translational modifications (PTMs) in a coordinated fashion (e.g. combinatorial PTM code) and identifying multiple protein species (e.g. proteolytically processed protein species) (11). Until recently, most top-down studies were limited to single purified proteins (12-15). Top-down studies of protein mixtures were restricted by difficulties in separating and fragmenting intact proteins and a shortage of robust computational tools. In the last two years, because of advances in protein separation and top-down instrumentation, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples containing hundreds and even thousands of proteins (16-21). Because algorithms for interpreting top-down spectra are still in their infancy, many recent developments include computational innovations in protein identification.