Abstract: The amino acid composition of sequences and structural attributes (α-helices, β-sheets) of C- and N-terminal fragments (50 amino acids) were compared for annotated (SWISS-PROT/TrEMBL) type I (20 sequences) and type III (22 sequences) secreted proteins of Gram-negative bacteria. Discriminant analysis, together with stepwise forward and backward selection of variables, identified the frequencies of the residues Arg, Glu, Gly, Ile, Met, Pro, Ser, Tyr and Val as a set of strong (1-P < 0.001) predictor variables that discriminate between the sequences of type I and type III secreted proteins with a cross-validated accuracy of 98.6-100 %. The internal and external validity of the discriminant analysis was confirmed by multiple (15 repeats) test-retest procedures using a randomly split original set of proteins; this validation method demonstrated an accuracy of 100 % for 191 non-selected (retest) sequences. Discriminant analysis was also applied using selected variables from the propensities for β-sheets and the polarity of C-terminal fragments. This approach produced the next highest cross-validated classification accuracy, which was comparable for randomly selected and retest proteins (85.4-86.0 % and 82.4-84.5 %, respectively). The proposed sets of predictor variables could be used to assess the compatibility between secretion substrates and secretion pathways of Gram-negative bacteria by means of discriminant analysis.
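
As an illustration of the kind of predictor variables described above, the following minimal Python sketch computes residue frequencies for the 50-residue N- and C-terminal fragments of a protein sequence. It is not the authors' implementation: the function names, the one-letter encoding of the nine residues, and the choice to concatenate frequencies from both termini into one vector are assumptions made for the example. The abstract does not state which terminus each residue frequency was taken from, nor the software used for the discriminant analysis.

```python
from collections import Counter

# One-letter codes for Arg, Glu, Gly, Ile, Met, Pro, Ser, Tyr, Val,
# the nine residues named in the abstract (ordering here is arbitrary).
SELECTED_RESIDUES = "REGIMPSYV"


def terminal_fragments(sequence, length=50):
    """Return the (N-terminal, C-terminal) fragments of the given length.

    Sequences shorter than `length` yield the whole sequence for both ends.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    seq = sequence.upper()
    return seq[:length], seq[-length:]


def residue_frequencies(fragment, residues=SELECTED_RESIDUES):
    """Relative frequency of each residue in `residues` within `fragment`."""
    if not fragment:
        raise ValueError("fragment must not be empty")
    counts = Counter(fragment)
    return {aa: counts[aa] / len(fragment) for aa in residues}


def feature_vector(sequence, length=50, residues=SELECTED_RESIDUES):
    """Hypothetical feature layout: N-terminal frequencies, then C-terminal."""
    n_term, c_term = terminal_fragments(sequence, length)
    n_freq = residue_frequencies(n_term, residues)
    c_freq = residue_frequencies(c_term, residues)
    return [n_freq[aa] for aa in residues] + [c_freq[aa] for aa in residues]


if __name__ == "__main__":
    # Toy sequence, not a real secreted protein.
    toy = "M" + "A" * 60 + "GGSV"
    print(feature_vector(toy))
```

Vectors of this form, one per annotated protein, could then be passed to any discriminant analysis routine together with the type I / type III labels; the stepwise variable selection and cross-validation mentioned in the abstract are not shown here.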