Mass spectrometry (MS) is used to quantify the relative distribution of glycans attached to particular protein glycosylation sites (micro-heterogeneity) and evaluate the molar site occupancy (macro-heterogeneity) in glycoproteomics. However, the accuracy of MS for such quantitative measurements remains to be clarified. As a key step towards this goal, a panel of related tryptic peptides with and without complex, biantennary, disialylated N-glycans was chemically synthesised by solid-phase peptide synthesis. Peptides mimicking those resulting from enzymatic deglycosylation using PNGase F/A and endo D/F/H were synthetically produced, carrying aspartic acid and N-acetylglucosamine-linked asparagine residues, respectively, at the glycosylation site. The MS ionisation/detection strengths of these pure, well-defined and quantified compounds were investigated using various MS ionisation techniques and mass analysers (ESI-IT, ESI-Q-TOF, MALDI-TOF, ESI/MALDI-FT-ICR-MS). Depending on the ion source/mass analyser, glycopeptides carrying complex-type N-glycans exhibited clearly lower signal strengths (10-50% of an unglycosylated peptide) when equimolar amounts were analysed. Less ionisation/detection bias was observed when the glycopeptides were analysed by nano-ESI and medium-pressure MALDI. The position of the glycosylation site within the tryptic peptides also influenced the signal response, in particular if detected as singly or doubly charged signals. This is the first study to systematically and quantitatively address and determine MS glycopeptide ionisation/detection strengths to evaluate glycoprotein micro-heterogeneity and macro-heterogeneity by label-free approaches. These data form a much-needed knowledge base for accurate quantitative glycoproteomics.
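To illustrate why the reported signal bias matters for label-free site occupancy, the following minimal Python sketch rescales a glycopeptide signal by a relative response factor before computing occupancy. The function names and the `relative_response` parameter are hypothetical and not taken from the study; the sketch assumes a single glycoform and a response factor that has been calibrated separately for the instrument in use.

```python
def naive_occupancy(glyco_signal, unglyco_signal):
    """Site occupancy taken directly from raw signal intensities."""
    total = glyco_signal + unglyco_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return glyco_signal / total


def corrected_occupancy(glyco_signal, unglyco_signal, relative_response):
    """Site occupancy after correcting for glycopeptide signal suppression.

    relative_response is an assumed calibration value: the glycopeptide
    signal divided by the signal of an equimolar amount of the
    unglycosylated peptide (for example 0.1 to 0.5).
    """
    if relative_response <= 0:
        raise ValueError("relative_response must be positive")
    # Rescale the glycopeptide signal to the unglycosylated peptide's scale.
    glyco_molar = glyco_signal / relative_response
    total = glyco_molar + unglyco_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return glyco_molar / total


if __name__ == "__main__":
    # Equal raw signals, glycopeptide assumed to respond at 50%.
    print(naive_occupancy(50.0, 50.0))
    print(corrected_occupancy(50.0, 50.0, 0.5))
```

With equal raw signals and an assumed response factor of 0.5, the uncorrected estimate is one half while the corrected estimate is two thirds, which shows how an uncorrected label-free readout would understate occupancy.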
The biological and clinical relevance of glycosylation is becoming increasingly recognized, leading to a growing interest in large-scale clinical and population-based studies. In the past few years, several methods for high-throughput analysis of glycans have been developed, but thorough validation and standardization of these methods are required before significant resources are invested in large-scale studies. In this study, we compared liquid chromatography, capillary gel electrophoresis, and two MS methods for quantitative profiling of N-glycosylation of IgG in the same data set of 1201 individuals. To evaluate the accuracy of the four methods, we then analysed their association with genetic polymorphisms and age. Chromatographic methods with either fluorescent or MS detection yielded slightly stronger associations than MS-only and multiplexed capillary gel electrophoresis, but at the expense of lower throughput. Advantages and disadvantages of each method were identified, which should inform the selection of the most appropriate method in future studies.
Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases proved essential when dealing with mouse samples. Moreover, the use of a “merged” database, containing all metagenomic sequences from the population under study, was found to be generally preferable over the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. Conclusions: This study helps identify host- and database-specific biases that need to be taken into account in a metaproteomic experiment, providing meaningful insights on how to design gut microbiota studies and to perform metaproteomic data analysis.
In particular, the use of multiple databases and annotation tools has to be encouraged, even though this requires appropriate bioinformatic resources. Electronic supplementary material: The online version of this article (doi:10.1186/s40168-016-0196-8) contains supplementary material, which is available to authorized users.
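The two database effects described in this abstract (database-specific peptide sets and a database-dependent Firmicutes/Bacteroidetes ratio) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the example peptide strings and counts are hypothetical and do not come from the study or from any particular search engine.

```python
def combine_searches(peptides_by_db):
    """Merge peptide identifications from parallel database searches.

    peptides_by_db maps a database label to the set of peptide sequences
    identified with it. Returns the union of all peptides and, for each
    database, the peptides identified with that database only.
    """
    all_peptides = set().union(*peptides_by_db.values())
    unique = {}
    for db, peptides in peptides_by_db.items():
        others = set().union(
            *(p for name, p in peptides_by_db.items() if name != db)
        )
        unique[db] = set(peptides) - others
    return all_peptides, unique


def firmicutes_bacteroidetes_ratio(phylum_counts):
    """Ratio of Firmicutes to Bacteroidetes counts (e.g. spectral counts)."""
    firmicutes = phylum_counts.get("Firmicutes", 0)
    bacteroidetes = phylum_counts.get("Bacteroidetes", 0)
    if bacteroidetes == 0:
        raise ValueError("no Bacteroidetes counts; ratio is undefined")
    return firmicutes / bacteroidetes


if __name__ == "__main__":
    # Hypothetical identifications from two parallel searches.
    searches = {
        "public": {"AAK", "CCR"},
        "metagenome": {"CCR", "DDK"},
    }
    merged, unique = combine_searches(searches)
    print(sorted(merged), unique)
    print(firmicutes_bacteroidetes_ratio({"Firmicutes": 300, "Bacteroidetes": 100}))
```

Computing the ratio separately for each database's results and comparing the values is one simple way to check whether a taxonomic conclusion depends on the database chosen.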
The enormous challenges of mass spectrometry-based metaproteomics are primarily related to the analysis and interpretation of the acquired data. These include the reliable identification of mass spectra and the meaningful integration of taxonomic and functional meta-information from samples containing hundreds of unknown species. To ease these difficulties, we developed a dedicated software suite, the MetaProteomeAnalyzer, an intuitive open-source tool for metaproteomics data analysis and interpretation, which includes multiple search engines and a feature that decreases data redundancy by grouping protein hits into so-called meta-proteins. We also designed a graph database back-end for the MetaProteomeAnalyzer to allow seamless analysis of results. The functionality of the MetaProteomeAnalyzer is demonstrated using a sample of a microbial community taken from a biogas plant.
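The idea of reducing redundancy by grouping protein hits can be sketched as follows. The abstract does not state the grouping rules used by the MetaProteomeAnalyzer, so this Python example assumes one plausible rule: proteins that share at least one identified peptide, directly or through a chain of other proteins, fall into the same group. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def group_meta_proteins(protein_peptides):
    """Group proteins connected through shared peptides (assumed rule).

    protein_peptides maps a protein accession to the set of peptides
    identified for it. Returns a sorted list of sorted accession lists.
    """
    parent = {protein: protein for protein in protein_peptides}

    def find(x):
        # Walk up to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every protein to the first protein seen with the same peptide.
    first_owner = {}
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            if peptide in first_owner:
                union(first_owner[peptide], protein)
            else:
                first_owner[peptide] = protein

    groups = defaultdict(set)
    for protein in protein_peptides:
        groups[find(protein)].add(protein)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    hits = {
        "P1": {"a", "b"},
        "P2": {"b", "c"},
        "P3": {"d"},
        "P4": {"c"},
    }
    print(group_meta_proteins(hits))
```

A union-find structure is used here because it merges groups transitively in a single pass over the peptide lists; other rules, such as requiring an identical peptide set or a shared taxonomy, would need a different grouping key.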