A method to correlate the uninterpreted tandem mass spectra of peptides produced under low energy (10-50 eV) collision conditions with amino acid sequences in the Genpept database has been developed. In this method the protein database is searched to identify linear amino acid sequences within a mass tolerance of ±1 u of the precursor ion molecular weight. A cross-correlation function is then used to provide a measurement of similarity between the mass-to-charge ratios for the fragment ions predicted from amino acid sequences obtained from the database and the fragment ions observed in the tandem mass spectrum. In general, a difference greater than 0.1 between the normalized cross-correlation functions of the first- and second-ranked search results indicates a successful match between sequence and spectrum. Searches of species-specific protein databases with tandem mass spectra acquired from peptides obtained from the enzymatically digested total proteins of E. coli and S. cerevisiae cells allowed matching of the spectra to amino acid sequences within proteins of these organisms. The approach described in this manuscript provides a convenient method to interpret tandem mass spectra with known sequences in a protein database. (J Am Soc Mass Spectrom 1994, 5, 976-989)

Amino acid sequence analysis is often the initial step in characterizing a newly isolated protein. Conventional sequencing strategies employ chemical reagents to remove one amino acid at a time from the amino terminus followed by isolation and analysis of the released amino acid derivative [1, 2]. Limitations in the chemical efficiency of the process prevent determination of the complete sequence of a protein from small quantities of sample. Partial sequence information, however, can be used to search a protein or nucleotide database to discover relationships to previously identified proteins or to determine if the protein sequence is novel [3, 4].
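The two numerical criteria stated in the abstract, candidate selection within ±1 u of the precursor mass and a score difference greater than 0.1 between the top two ranked results, can be illustrated with a minimal Python sketch. Several things here are assumptions and not taken from the paper: the function names `candidate_peptides` and `is_confident_match`, the use of standard monoisotopic residue masses, and the choice to normalize scores by dividing by the top-ranked score. The cross-correlation scoring itself is not shown.

```python
# Standard monoisotopic residue masses in u (assumption: the paper's own
# mass table is not given in this section and may differ).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the summed residues


def candidate_peptides(protein, target_mass, tol=1.0):
    """Return every contiguous subsequence of `protein` whose neutral
    mass lies within `tol` u of `target_mass` (hypothetical helper)."""
    hits = []
    for start in range(len(protein)):
        mass = WATER
        for end in range(start, len(protein)):
            mass += RESIDUE_MASS[protein[end]]
            if mass > target_mass + tol:
                break  # residue masses are positive, so longer spans only get heavier
            if abs(mass - target_mass) <= tol:
                hits.append(protein[start:end + 1])
    return hits


def is_confident_match(scores, min_delta=0.1):
    """Apply the 0.1 difference rule to raw similarity scores.
    Assumption: scores are normalized by the top-ranked score."""
    ranked = sorted(scores, reverse=True)
    if len(ranked) < 2 or ranked[0] <= 0:
        return False  # the rule needs two ranked results to compare
    delta = (ranked[0] - ranked[1]) / ranked[0]
    return delta > min_delta
```

For example, `candidate_peptides("GASPVTK", 301.16)` scans all contiguous spans of the toy sequence, and `is_confident_match([2.0, 1.7])` checks whether the runner-up falls more than 0.1 below the normalized top score.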
Although sequence information may have been determined previously, the context in which the protein is identified may be relevant to the biological process under study [5]. Another method to identify known protein sequences employs site-specific proteolysis followed by measurement of the mass-to-charge ratios of the peptides by mass spectrometry. The set of observed peptide mass-to-charge ratios is then used to search a protein database to find a set of peptide masses predicted from enzymatic digestion of each protein in the database [6-10]. Both chemical degradation and peptide mapping approaches require the use of fairly homogeneous samples to avoid ambiguity in assigning Address reprint requests to John R.