We consider the question of how to design proteins. How can we find "good" amino acid sequences (i) that fold to a desired "target" structure as a native conformation of lowest accessible free energy and (ii) that will not simultaneously fold to many other conformations of the same free energy? Current protein designs often focus on helix propensities and turns. We focus here on designing the hydrophobicity. For a model of self-avoiding hydrophobic/polar chains on two-dimensional square lattices, geometric proofs and exhaustive enumerations show the following results. (i) The strategy hydrophobic residues inside/polar residues outside is not optimal. Placement of additional hydrophobic residues on the surface is often necessary. (ii) To avoid unwanted conformations, the designed sequence must have neither too many nor too few hydrophobic residues. (iii) The computational complexity of inverse folding appears to be in a different class than folding: unlike the folding problem, the design problem does not scale exponentially with chain length. Some design strategies, described here for the lattice model, produce good sequences and scale only linearly with chain length.

Recently there have been major advances in protein design [i.e., in the design of amino acid sequences that will fold to desired "target" native conformations (1-11)]. However, several hurdles remain to be overcome. (i) So far, most designed proteins have only simple symmetries: 4-helix bundles and all-sheet conformations, for example (2-4, 9, 10). (ii) Most of the "rational" aspects of design currently focus on the interactions among the "connected" neighbors [i.e., on the intrinsic propensities of monomeric and dimeric amino acids to form helices and turns (12-16)]. However, major forces of folding are due to hydrophobic and other "nonlocal" interactions [i.e., among monomers far apart in the sequence (17-19)].
The main design principle currently used for them is hydrophobic (H) residues inside/polar (P) residues outside. Nevertheless, many real proteins have exposed nonpolar and buried polar units (20, 21), and some single-site mutations contradict the hydrophobic residues inside/polar residues outside principle (22). Are there more subtle principles that are important? (iii) A major design problem has been how to avoid simultaneous folding to wrong conformations; designed sequences sometimes appear to fold to "gemisch" states, involving multiple or wrong native structures (ref. 4 and T. Handel, personal communication). It is not known how to control the multiplicity of stable structures (i.e., how to design the desired structure uniquely into the sequence). The problem of sequence design is the "inverse protein folding" problem (23-25). Whereas the input of a protein folding algorithm would be an amino acid sequence and the output would be a native structure, the input for an inverse folding algorithm would be a desired native structure and the output would be a sequence that will fold to it. Can heuristic rules be f...