We report a novel peak sorting method for the two-dimensional gas chromatography/time-of-flight mass spectrometry (GC×GC/TOF-MS) system. The objective of peak sorting is to recognize, among the thousands of peaks detected in the analytical procedure, those that arise from the same metabolite in different samples. The developed algorithm is based on the fact that the chromatographic peaks for a given analyte have similar retention times in all of the chromatograms. Raw instrument data are first processed by ChromaTOF (Leco) software to provide the peak tables. Our algorithm achieves peak sorting by using the first- and second-dimension retention times in the peak tables, together with the mass spectra generated by electron impact ionization. The algorithm searches the peak tables for peaks generated by the same metabolite using several search criteria. Our software also includes options to eliminate non-target peaks, e.g., peaks of contaminants, from the sorting results. The developed software package has been tested using a mixture of standard metabolites and another mixture of standard metabolites spiked into human serum. Manual validation demonstrates high accuracy of peak sorting with this algorithm.
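The idea of sorting peaks by retention time and mass spectrum can be illustrated with a minimal sketch. This is not the published algorithm: the abstract does not state its search criteria, so the `Peak` record, the tolerance values (`rt1_tol`, `rt2_tol`, `min_sim`), the use of cosine similarity for spectra, and the greedy reference-table strategy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Peak:
    """One row of a peak table (hypothetical layout)."""
    sample: str
    rt1: float       # first-dimension retention time, seconds
    rt2: float       # second-dimension retention time, seconds
    spectrum: tuple  # ((m/z, intensity), ...)


def cosine_similarity(a, b):
    """Cosine similarity between two spectra given as (m/z, intensity) pairs."""
    da, db = dict(a), dict(b)
    dot = sum(i * db.get(mz, 0.0) for mz, i in da.items())
    norm = sqrt(sum(i * i for i in da.values())) * sqrt(sum(i * i for i in db.values()))
    return dot / norm if norm else 0.0


def sort_peaks(tables, rt1_tol=5.0, rt2_tol=0.1, min_sim=0.8):
    """Group peaks across samples, using the first table as the reference.

    A peak joins a reference peak's group when both retention times fall
    within the (assumed) tolerances and the spectra are similar enough.
    If several peaks in one table qualify, the most similar spectrum wins.
    """
    reference, *others = tables
    groups = []
    for ref in reference:
        group = [ref]
        for table in others:
            candidates = [
                p for p in table
                if abs(p.rt1 - ref.rt1) <= rt1_tol
                and abs(p.rt2 - ref.rt2) <= rt2_tol
                and cosine_similarity(p.spectrum, ref.spectrum) >= min_sim
            ]
            if candidates:
                best = max(candidates,
                           key=lambda p: cosine_similarity(p.spectrum, ref.spectrum))
                group.append(best)
        groups.append(group)
    return groups


# Two toy peak tables with made-up values, purely for demonstration.
sample_a = [
    Peak("A", 600.0, 1.50, ((73, 100.0), (147, 40.0))),
    Peak("A", 900.0, 2.10, ((57, 80.0), (71, 60.0))),
]
sample_b = [
    Peak("B", 602.0, 1.52, ((73, 95.0), (147, 45.0))),
    Peak("B", 601.0, 1.51, ((57, 90.0), (71, 50.0))),  # close in time, different spectrum
]
groups = sort_peaks([sample_a, sample_b])
for g in groups:
    print([(p.sample, p.rt1, p.rt2) for p in g])
```

In the toy data, the second peak of sample B co-elutes with the first peak of sample A but shares no fragment ions with it, so retention time alone would mis-assign it; requiring spectral similarity as well rejects the match. A fuller implementation would also need to handle peaks absent from the reference table and the contaminant filtering mentioned in the abstract.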
Proteomics is a still-evolving combination of technologies for describing and characterizing all expressed proteins in a biological system. Because mass spectrometers have an upper limit on detectable mass, the most widely employed strategy is the bottom-up approach, in which tryptic peptides from complex protein mixtures are identified and quantified. Protein identification from tandem mass spectra remains a challenge in proteomics. Two approaches have been developed to identify proteins from tandem mass spectra: database searching and de novo sequencing. These approaches typically have positive identification rates of only ~10-20% and exhibit high false positive identification rates. This review surveys existing algorithms developed for database searching and de novo sequencing, with a focus on recent developments in tandem mass spectrum quality assessment, peptide identification using annotated spectral libraries, statistical approaches to assessing identification quality, and methods for constrained searches. We also review research comparing the performance of existing protein identification packages.