Comprehensive knowledge of proteome complexity is crucial to understanding cell function. Amino termini of yeast proteins were identified through peptide mass spectrometry on glutaraldehyde-treated cell lysates and through a parallel assessment of publicly deposited spectra. An unexpectedly large fraction of detected amino-terminal peptides (35%) mapped to translation initiation at AUG codons downstream of the annotated start codon. Many of the implicated genes have suboptimal sequence contexts for translation initiation near their annotated AUG, and their ribosome profiles show elevated tag densities consistent with translation initiation at downstream AUGs as well as at their annotated AUGs. These data suggest that a significant fraction of the yeast proteome derives from initiation at downstream AUGs, significantly increasing the repertoire of encoded proteins and their potential functions and cellular localizations.
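As a minimal illustration of what "AUG codons downstream of the annotated start codon" means at the sequence level, the sketch below lists in-frame AUG (ATG) codons in an open reading frame. It is not the authors' pipeline: the function names and the toy sequence are hypothetical, and the input is assumed to be a coding sequence that begins at the annotated start codon.

```python
def in_frame_augs(orf: str) -> list[int]:
    """Return 0-based nucleotide offsets of every in-frame ATG/AUG codon.

    Assumption: `orf` starts at the annotated start codon, so offset 0
    is the annotated AUG and the reading frame is offsets 0, 3, 6, ...
    """
    seq = orf.upper().replace("U", "T")  # accept RNA or DNA alphabets
    return [i for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] == "ATG"]


def downstream_starts(orf: str) -> list[int]:
    """Codon indices (0 = annotated start) of in-frame downstream AUGs."""
    return [pos // 3 for pos in in_frame_augs(orf) if pos > 0]


# Toy ORF (hypothetical): Met-Lys-Met-Leu-stop
example = "ATGAAAATGCTTTAA"
print(downstream_starts(example))  # [2]
```

An amino-terminal peptide that begins at one of the returned codon indices, rather than at codon 0, would be a candidate for initiation at a downstream AUG.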
Peptide mass spectrometry relies crucially on algorithms that match peptides to spectra. We describe a method to evaluate the accuracy of these algorithms based on the masses of parent proteins before trypsin endoprotease digestion. Measurement of conformance to parent proteins provides a score for comparing the performance of different algorithms as well as alternative parameter settings for a given algorithm. Tracking of conformance scores for spectrum matches to proteins with progressively lower expression levels revealed that conformance scores are not uniform within data sets but are significantly lower for less abundant proteins. Similarly, peptides with lower algorithm peptide-spectrum match scores have lower conformance. Although peptide mass spectrometry data are typically filtered through decoy analysis to ensure a low false discovery rate, this analysis confirms that the filtered data should not be considered as having uniform confidence. The analysis suggests that use of different algorithms and multiple standardized parameter settings of these algorithms can significantly increase the number of peptides identified. This data set can be used as a resource for future algorithm assessment.
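The idea of a conformance score can be sketched as follows. The abstract does not give the exact definition, so everything specific here is an assumption: each peptide-spectrum match (PSM) is taken to carry the predicted mass of its undigested parent protein and the mass range of the sample fraction the spectrum came from, and conformance is taken to be the fraction of PSMs whose parent mass falls inside that range, within a tolerance. The `PSM` fields, the 10% tolerance, and the toy values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    peptide: str
    score: float             # algorithm match score (assumed: higher is better)
    parent_mass_kda: float   # predicted mass of the undigested parent protein
    fraction_min_kda: float  # hypothetical lower mass bound of the sample fraction
    fraction_max_kda: float  # hypothetical upper mass bound of the sample fraction


def is_conformant(psm: PSM, tolerance: float = 0.1) -> bool:
    """True if the parent mass lies within the fraction's range, widened by `tolerance`."""
    lo = psm.fraction_min_kda * (1 - tolerance)
    hi = psm.fraction_max_kda * (1 + tolerance)
    return lo <= psm.parent_mass_kda <= hi


def conformance(psms: list[PSM], tolerance: float = 0.1) -> float:
    """Fraction of PSMs whose parent protein mass conforms to its fraction."""
    if not psms:
        raise ValueError("conformance is undefined for an empty PSM list")
    return sum(is_conformant(p, tolerance) for p in psms) / len(psms)


# Toy data (invented for illustration only)
psms = [
    PSM("AAK", 50.0, 45.0, 40.0, 50.0),
    PSM("CCK", 40.0, 52.0, 40.0, 50.0),
    PSM("DDK", 12.0, 120.0, 40.0, 50.0),
    PSM("EEK", 30.0, 38.0, 40.0, 50.0),
]
print(conformance(psms))  # 0.75

# Conformance can be compared across subsets, e.g. by match score
high = [p for p in psms if p.score >= 30.0]
low = [p for p in psms if p.score < 30.0]
print(conformance(high), conformance(low))  # 1.0 0.0
```

Computing the same score for subsets binned by match score or by protein abundance is one way to check whether confidence is uniform across a filtered data set.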
In the framework of the European BIOTECH project for sequencing the Saccharomyces cerevisiae genome, we have determined the nucleotide sequence of the left part of the cosmid clone 232 and the cosmid clone 233 provided by F. Galibert (Rennes Cedex, France). We present here 33,099 base pairs of sequence derived from the left arm of chromosome X of strain S288C. This sequence reveals 17 open reading frames (ORFs) with more than 299 base pairs, including the published sequences for ARG3, LIGTR/LIG1, ORF2, ACT3 and SCP160. Two other ORFs showed similarity with S. cerevisiae genes: one with the CAN1 gene coding for an arginine permease, and one with genes encoding the family of transcriptional activators containing a fungal Zn(II)2-Cys6 binuclear cluster domain like that found in Ppr1p or Gal4p. Both putative proteins contain a leucine zipper motif; the Can1p homologue has 12 putative membrane-spanning domains and a putative alpha 2-SCB-alpha 2 binding site. In a diploid disruption mutant of ORF J0922, which codes for the transcriptional activator homologue, no colonies appeared until 10 days after transformation, and these then grew slowly. In contrast, haploid disruption mutants showed a growth phenotype like wild-type cells. One ORF showed weak similarity to the rad4 gene product of Schizosaccharomyces pombe and is essential for yeast growth. Five ORFs showed similarity to putative genes on the right arm of chromosome XI of S. cerevisiae. Two of them are similar to each other and belong to a family of extracellular proteins that groups mammalian SCP/Tpx-1, insect Ag3/Ag5, plant PR-1 and fungal Sc7/Sc14. (ABSTRACT TRUNCATED AT 250 WORDS)