The precise and large-scale identification of intact glycopeptides is a critical step in glycoproteomics. Owing to the complexity of glycosylation, the current overall throughput, data quality and accessibility of intact glycopeptide identification lag behind those in routine proteomic analyses. Here, we propose a workflow for the precise high-throughput identification of intact N-glycopeptides at the proteome scale using stepped-energy fragmentation and a dedicated search engine. The search engine, pGlyco 2.0, conducts comprehensive quality control including false discovery rate evaluation at all three levels of matches to glycans, peptides and glycopeptides, improving the current level of accuracy of intact glycopeptide identification. The N-glycoproteome of samples metabolically labeled with 15N/13C was analyzed quantitatively and utilized to validate the glycopeptide identification, which could be used as a novel benchmark pipeline to compare different search engines. Finally, we report a large-scale glycoproteome dataset consisting of 10,009 distinct site-specific N-glycans on 1,988 glycosylation sites from 955 glycoproteins in five mouse tissues.
Confident characterization of the microheterogeneity of protein glycosylation through identification of intact glycopeptides remains one of the toughest analytical challenges for glycoproteomics. Recently proposed mass spectrometry (MS)-based methods still have shortcomings, such as the lack of false discovery rate (FDR) analysis for glycan identification and insufficient fragmentation information for peptide identification. Here we propose pGlyco, a novel pipeline for the identification of intact glycopeptides using complementary MS techniques: 1) HCD-MS/MS followed by product-dependent CID-MS/MS was used to provide complementary fragments to identify the glycans, and a novel target-decoy method was developed to estimate the FDR of the glycan identification; 2) data-dependent acquisition of MS3 for some of the most intense peaks of HCD-MS/MS was used to provide fragments to identify the peptide backbones. By integrating HCD-MS/MS, CID-MS/MS and MS3, intact glycopeptides could be confidently identified. With pGlyco, a standard glycoprotein mixture was analyzed on an Orbitrap Fusion, and 309 non-redundant intact glycopeptides were identified with detailed spectral information of both glycans and peptides.
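For readers unfamiliar with target-decoy FDR estimation, the general idea can be sketched as follows. This is a generic illustration only, not the glycan-specific method developed for pGlyco: the function names, the `(score, is_decoy)` input format and the 1% default threshold are all assumptions made for the example. Matches are ranked by score, the FDR at each rank is estimated as the number of decoy matches divided by the number of target matches seen so far, and each match receives a q-value, the lowest estimated FDR at which it would still be accepted.

```python
from typing import List, Tuple


def target_decoy_qvalues(
    matches: List[Tuple[float, bool]],
) -> List[Tuple[float, bool, float]]:
    """Generic target-decoy sketch (hypothetical helper, not pGlyco code).

    matches: (score, is_decoy) pairs, higher score = better match.
    Returns (score, is_decoy, q_value) sorted by descending score.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)

    # Estimated FDR at each rank: decoys so far / targets so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: minimum FDR at this rank or any lower-scoring rank,
    # which makes the values monotone along the ranked list.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(min(running_min, 1.0))
    qvals.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


def accepted_targets(
    matches: List[Tuple[float, bool]], threshold: float = 0.01
) -> List[float]:
    """Scores of target matches whose q-value is within the threshold."""
    return [
        s for s, d, q in target_decoy_qvalues(matches)
        if not d and q <= threshold
    ]


if __name__ == "__main__":
    demo = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
    print(accepted_targets(demo, threshold=0.01))
```

In a workflow that controls error at several levels (for example glycan, peptide and glycopeptide), a filter of this kind would be applied once per level with that level's own scores and decoys; how the decoys are generated is the method-specific part and is not covered by this sketch.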
Great advances have been made in mass spectrometric data interpretation for intact glycopeptide analysis. However, fast and accurate identification of intact glycopeptides and modified saccharide units at the site-specific level remains challenging. Here, we present a glycan-first glycopeptide search engine, pGlyco3, to comprehensively analyze intact N- and O-glycopeptides, including glycopeptides with modified saccharide units. A glycan ion-indexing algorithm developed for glycan-first search makes pGlyco3 5–40 times faster than other glycoproteomic search engines without decreasing accuracy or sensitivity. By combining electron-based dissociation spectra, pGlyco3 integrates a dynamic programming-based algorithm termed pGlycoSite for site-specific glycan localization. Our evaluation shows that the site-specific glycan localization probabilities estimated by pGlycoSite are suitable for localizing site-specific glycans. With pGlyco3, we confidently identified N-glycopeptides and O-mannose glycopeptides that were extensively modified by ammonia adducts in yeast samples. The freely available pGlyco3 is an accurate and flexible tool that can be used to identify glycopeptides and modified saccharide units.
Glycoproteomics is a powerful yet analytically challenging research tool. Software packages aiding the interpretation of complex glycopeptide tandem mass spectra have appeared, but their relative performance remains untested. Conducted through the HUPO Human Glycoproteomics Initiative, this community study, comprising both developers and users of glycoproteomics software, evaluates solutions for system-wide glycopeptide analysis. The same mass spectrometry-based glycoproteomics datasets from human serum were shared with participants, and the relative team performance for N- and O-glycopeptide data analysis was comprehensively established by orthogonal performance tests. Although the results were variable, several high-performance glycoproteomics informatics strategies were identified. Deep analysis of the data revealed key performance-associated search parameters and led to recommendations for improved ‘high-coverage’ and ‘high-accuracy’ glycoproteomics search solutions. This study concludes that diverse software packages for comprehensive glycopeptide data analysis exist, points to several high-performance search strategies and specifies key variables that will guide future software developments and assist informatics decision-making in glycoproteomics.