We present a computational pipeline for the quantification of peptides and proteins in label-free LC-MS/MS data sets. The pipeline is composed of tools from the OpenMS software framework and is applicable to the processing of large experiments (50+ samples). We describe several enhancements that we have introduced to OpenMS to realize the implementation of this pipeline. They include new algorithms for centroiding of raw data, for feature detection, for the alignment of multiple related measurements, and a new tool for the calculation of peptide and protein abundances. Where possible, we compare the performance of the new algorithms to that of their established counterparts in OpenMS. We validate the pipeline on the basis of two small data sets that provide ground truths for the quantification. There, we also compare our results to those of MaxQuant and Progenesis LC-MS, two popular alternatives for the analysis of label-free data. We then show how our software can be applied to a large heterogeneous data set of 58 LC-MS/MS runs.
Background: Modern data generation techniques used in distributed systems biology research projects often create datasets of enormous size and diversity. We argue that a sound information system is required to overcome the challenge of managing those large quantitative datasets and to maximise the biological information extracted from them. Ease of integration with data analysis pipelines and other computational tools is a key requirement for such a system. Results: We have developed openBIS, an open source software framework for constructing user-friendly, scalable and powerful information systems for data and metadata acquired in biological experiments. openBIS enables users to collect, integrate, share and publish data, and to connect to data processing pipelines. This framework can be extended and has been customized for different data types acquired by a range of technologies. Conclusions: openBIS is currently being used by several SystemsX.ch and EU projects applying mass spectrometric measurements of metabolites and proteins, High Content Screening, or Next Generation Sequencing technologies. The attributes that make it interesting to a large research community involved in systems biology projects include versatility, simplicity in deployment, scalability to very large data, flexibility to handle any biological data type and extensibility to the needs of any research domain.
Background: The human pathogen Streptococcus pyogenes adapts to vascular leakage at the site of infection. Results: S. pyogenes modifies the production of 213 proteins in plasma, as determined using quantitative proteomics.
Conclusion: The results clarify the function of HSA-binding proteins in S. pyogenes. Significance: Our data demonstrate the power of the quantitative mass spectrometry strategy to investigate bacterial adaptation to a given environment.