A method is presented for locating protein antigenic determinants by analyzing amino acid sequences in order to find the point of greatest local hydrophilicity. This is accomplished by assigning each amino acid a numerical value (hydrophilicity value) and then repetitively averaging these values along the peptide chain. The point of highest local average hydrophilicity is invariably located in, or immediately adjacent to, an antigenic determinant. It was found that the prediction success rate depended on averaging group length, with hexapeptide averages yielding optimal results. The method was developed using 12 proteins for which extensive immunochemical analysis has been carried out and subsequently was used to predict antigenic determinants for the following proteins: hepatitis B surface antigen, influenza hemagglutinins, fowl plague virus hemagglutinin, human histocompatibility antigen HLA-B7, human interferons, Escherichia coli and cholera enterotoxins, ragweed allergens Ra3 and Ra5, and streptococcal M protein. The [...] and indicate that they are frequently found on regions of a molecule that have an unusually high degree of exposure to solvent, i.e., regions which project into the medium (for reviews, see refs. 1 and 3). This, together with the fact that charged, hydrophilic amino acid side chains are common features of antigenic determinants, led us to investigate the possibility that at least some antigenic determinants might be associated with stretches of amino acid sequence that contain a large number of charged and polar residues and are lacking in large hydrophobic residues. A suitable means of methodically searching for such regions was found by combining a method like that of Chou and Fasman (5), in which numerical values for amino acids are repetitively averaged over the length of a polypeptide chain, with a set of values expressing the relative hydrophilicity of each amino acid.
Suitable values were available in the solvent parameters assigned by Levitt (6), which are derivatives of the hydrophobicity values of Nozaki and Tanford (9).
Table 1 lists the numerical values (hydrophilicity values) assigned to the 20 amino acids commonly found in proteins. The first column gives the values of Levitt (6), whereas the second column gives the values that were finally chosen for our hydrophilicity calculations. The values were generally retained as expressed by Levitt; however, changes in the values for proline, aspartic acid, and glutamic acid improve the prediction results, as explained later.
Hydrophilicity analysis of a protein is carried out by the following method. Each amino acid in the sequence of the protein is assigned its hydrophilicity value, then these values are repetitively averaged down the length of the polypeptide chain, generating a series of local hydrophilicity values. The number of hydrophilicity values that are averaged at each repetition is arbitrary, and we chose groups of six for our initial studies because this is the approximate size of an antigenic determinant...
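
The averaging procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' original program: the function names `local_hydrophilicity` and `highest_point` are hypothetical, and `TOY_SCALE` holds placeholder numbers chosen only for the example; they are not the Table 1 values. The sketch assumes a simple sliding window that advances one residue at a time, with a default group length of six.

```python
from typing import Dict, List, Tuple


def local_hydrophilicity(sequence: str, scale: Dict[str, float],
                         window: int = 6) -> List[float]:
    """Average per-residue values over every run of `window` consecutive residues."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # A residue missing from `scale` raises KeyError rather than being guessed.
    values = [scale[residue] for residue in sequence]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


def highest_point(sequence: str, scale: Dict[str, float],
                  window: int = 6) -> Tuple[int, str, float]:
    """Return (start index, segment, average) of the most hydrophilic window."""
    averages = local_hydrophilicity(sequence, scale, window)
    if not averages:
        raise ValueError("sequence is shorter than the averaging window")
    # On ties, max() keeps the first (most N-terminal) window.
    best = max(range(len(averages)), key=averages.__getitem__)
    return best, sequence[best:best + window], averages[best]


# Placeholder scale for illustration only (assumption, NOT the Table 1 values).
TOY_SCALE = {"K": 3.0, "S": 0.5, "G": 0.0, "L": -2.0}

if __name__ == "__main__":
    start, segment, average = highest_point("GLLGSKKSKKGLLG", TOY_SCALE)
    print(start, segment, round(average, 3))
```

Passing the scale as an argument keeps the averaging step separate from the choice of values, so the same function could be run with either column of Table 1 or with a different group length to compare prediction results.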