cTrans is a comprehensive utility for generating polypeptide databases from cDNA sequences. It integrates four main functions: retrieving the sequences of a species of interest from packages downloaded from dbEST of GenBank; format conversion; checking for and deleting vector and adaptor contamination; and translating the cDNA sequences in all six frames and selecting specific translations for database construction according to a user-defined length threshold. The utility is also applicable to cDNA sequences produced by users themselves.

Keywords: Bioinformatics / Databases / Expressed sequence tags / Peptide mass fingerprinting

Proteomics 2007, 7, 177-179

Determining the identity of protein spots from 2-DE of protein extracts is one of the most important procedures in proteomic analysis. A popular method for this purpose is to generate PMF data for the individual spots using MALDI-TOF MS and/or ESI-MS and then compare the PMF data with the predicted peptide mass information of known proteins in polypeptide/protein databases. Protein identification gains greater confidence when peptide fragmentation data from the tandem (MS/MS) spectrum are included in the database search. Although quite a few polypeptide/protein databases are available, such as PRF/SEQDB, Swiss-Prot, TrEMBL and NCBInr, only the proteins or predicted polypeptides from comprehensively studied or model species are relatively well represented. Thus, identification of proteins from less-represented species relies largely on the homology or conservation of the sample protein with known proteins in cross-species databases. Because polymorphism may exist between species at the amino acid level even for the most conserved proteins, the cross-species matching approach is compromised when the mass spectra are affected by such polymorphism [1,2].
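The translation step named in the abstract (six-frame translation followed by selection against a length threshold) can be illustrated with a minimal Python sketch. This is not the cTrans implementation: the function names, the use of the standard genetic code, the splitting of each frame at stop codons, and the `min_len` parameter are all assumptions made for illustration, since the exact selection rule is not described in this passage.

```python
# Illustrative sketch only; names and selection rule are hypothetical,
# not taken from cTrans.

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; ambiguous codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq):
    """Yield (frame_label, protein) for the three forward and three reverse frames."""
    seq = seq.upper()
    for sign, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            yield f"{sign}{offset + 1}", translate(strand[offset:])


def select_translations(seq, min_len):
    """Assumed rule: split each frame at stops and keep stretches >= min_len residues."""
    kept = []
    for frame, protein in six_frame(seq):
        for segment in protein.split("*"):
            if len(segment) >= min_len:
                kept.append((frame, segment))
    return kept
```

Calling `select_translations(est_sequence, min_len)` on a cleaned EST returns the stop-free stretches from any of the six frames that meet the threshold, which is one plausible way to discard short out-of-frame fragments before building a search database.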
To overcome this shortcoming in species without full genome sequence information, researchers have proposed querying EST raw sequence databases or EST translation databases as an alternative [1,3-6]. MS fingerprinting searches against EST raw sequences can be performed using MASCOT [7], ProFound [8] or Protein-Prospector [9] by directly specifying certain parameters, and identification using MS/MS data against EST raw sequences can be performed using search programs such as SEQUEST (http://fields.scripps.edu/sequest/), MASCOT, PepFrag (http://129.85.19.192/prowl/pepfragch.html), and MS-Tag (http://prospector.ucsf.edu/ucsfhtml4.0/mstagfd.htm). However, the efficiency of EST database-based queries suffers from out-of-frame translations.

Proteome research could be facilitated by the availability of quality-controlled, specialized EST translation databases. A few species-specific EST translation databases are maintained by commercial organizations or academic institutions. However, they are often not freely downloadable and/or require specific search engines for protein identification. Species-specific EST...