Proteomics is the large-scale study of proteins, particularly their expression, structures and functions. This still-emerging combination of technologies aims to describe and characterize all expressed proteins in a biological system. Because of upper limits on mass detection of mass spectrometers, proteins are usually digested into peptides and the peptides are then separated, identified and quantified from this complex enzymatic digest. The problem in digesting proteins first and then analyzing the peptide cleavage fragments by mass spectrometry is that huge numbers of peptides are generated that overwhelm direct mass spectral analyses. The objective in the liquid chromatography approach to proteomics is to fractionate peptide mixtures to enable and maximize identification and quantification of the component peptides by mass spectrometry. This review will focus on existing multidimensional liquid chromatographic (MDLC) platforms developed for proteomics and their application in combination with other techniques such as stable isotope labeling. We also provide some perspectives on likely future developments.
A novel peak alignment algorithm using a distance and spectrum correlation optimization (DISCO) method has been developed for two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC/TOF-MS) based metabolomics. This algorithm uses the output of the instrument control software, ChromaTOF, as its input data. It detects and merges multiple peak entries of the same metabolite into one peak entry in each input peak list. After a z-score transformation of metabolite retention times, DISCO selects landmark peaks from all samples based on both two-dimensional retention times and mass spectrum similarity of fragment ions measured by Pearson's correlation coefficient. A local linear fitting method is employed in the original two-dimensional retention time space to correct retention time shifts. A progressive retention time map searching method is used to align metabolite peaks in all samples together based on optimization of the Euclidean distance and mass spectrum similarity. The effectiveness of the DISCO algorithm is demonstrated using data sets acquired under different experimental conditions and a spiked-in experiment.
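The matching criteria named above (z-scored two-dimensional retention times, Euclidean distance, and Pearson correlation of fragment-ion spectra) can be illustrated with a minimal sketch. This is not the DISCO implementation: the function names, the greedy nearest-neighbour search, and the `max_dist` and `min_corr` thresholds are all assumptions made for illustration, and the sketch omits peak merging, landmark selection across many samples, and the local linear retention time correction.

```python
import numpy as np

def zscore(rt):
    """Column-wise z-score of an (n_peaks, 2) array of retention times."""
    rt = np.asarray(rt, dtype=float)
    return (rt - rt.mean(axis=0)) / rt.std(axis=0)

def pearson(a, b):
    """Pearson correlation between two fragment-ion intensity vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])

def match_peaks(rt_ref, spec_ref, rt_smp, spec_smp,
                max_dist=0.5, min_corr=0.9):
    """Pair each reference peak with the closest sample peak whose
    spectrum is sufficiently similar.

    max_dist and min_corr are hypothetical thresholds, not values
    taken from the source. The search is greedy and does not enforce
    a one-to-one assignment.
    """
    z_ref, z_smp = zscore(rt_ref), zscore(rt_smp)
    pairs = []
    for i in range(len(z_ref)):
        # Euclidean distance in z-scored 2D retention time space
        dists = np.linalg.norm(z_smp - z_ref[i], axis=1)
        for j in np.argsort(dists):
            if dists[j] > max_dist:
                break  # remaining candidates are even farther away
            if pearson(spec_ref[i], spec_smp[j]) >= min_corr:
                pairs.append((i, int(j)))
                break
    return pairs
```

Requiring both criteria means a sample peak that elutes close to a reference peak is still rejected when its spectrum does not correlate, which is the intuition behind combining distance with spectral similarity.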
A method was developed to employ National Institute of Standards and Technology (NIST) 2008 retention index database information for molecular retention matching by constructing a set of empirical distribution functions (DFs) of the absolute deviation of each retention index from its mean value. The effects of different experimental parameters on the molecules’ retention indices were first assessed. The results show that the column class, the column type, and the data type have significant effects on the retention index values acquired on capillary columns. However, the normal alkane retention index (Inorm) with the ramp condition is similar to the linear retention index (IT), while the Inorm with the isothermal condition is similar to the Kováts retention index (I). The Inorm data acquired with the complex condition should be treated as an additional group, because the mean Inorm value of the polar column is significantly different from the IT. Based on this analysis, nine DFs were generated from the grouped retention index data. The DF information was then implemented in a software program called iMatch. The performance of iMatch was evaluated using experimental data from a mixture of standards and from a metabolite extract of rat plasma with spiked-in standards. About 19% of the molecules identified by ChromaTOF were filtered out by iMatch from the identification list of electron ionization (EI) mass spectral matching, while all of the spiked-in standards were preserved. The results demonstrate that retention index values, applied through a set of DFs, can improve spectral matching-based identification by removing a significant portion of false positives.
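As a rough illustration of the idea, the following sketch builds an empirical distribution function from a set of absolute retention index deviations and uses one of its quantiles to accept or reject a candidate identification. It is not the iMatch code: the function names, the 0.95 confidence level, and the simple threshold rule are assumptions, and the sketch handles a single group of deviations where the method above uses nine.

```python
import numpy as np

def empirical_df(abs_devs):
    """Return F, where F(x) is the fraction of recorded absolute
    retention index deviations that are <= x."""
    sorted_devs = np.sort(np.asarray(abs_devs, dtype=float))

    def F(x):
        return np.searchsorted(sorted_devs, x, side="right") / sorted_devs.size

    return F

def deviation_threshold(abs_devs, confidence=0.95):
    """Deviation below which the given fraction of recorded values fall.
    The confidence level is a hypothetical choice."""
    return float(np.quantile(np.asarray(abs_devs, dtype=float), confidence))

def keep_candidate(ri_observed, ri_library_mean, threshold):
    """Keep a spectral match only if its retention index lies within
    the threshold of the library mean (an assumed filtering rule)."""
    return abs(ri_observed - ri_library_mean) <= threshold
```

For example, with a threshold derived from the appropriate group of deviations, a spectral match whose observed retention index is far from the library mean would be dropped from the identification list, while one within the threshold would be kept.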