Aminoacyl-tRNA synthetases recognize tRNA anticodon and 3′ acceptor stem bases. Synthetase Urzymes acylate cognate tRNAs even without anticodon-binding domains, in keeping with the possibility that acceptor stem recognition preceded anticodon recognition. Representing tRNA identity elements with two bits per base, we show that the anticodon encodes the hydrophobicity of each amino acid side-chain as represented by its water-to-cyclohexane distribution coefficient, and this relationship holds true over the entire temperature range of liquid water. The acceptor stem codes preferentially for the surface area or size of each side-chain, as represented by its vapor-to-cyclohexane distribution coefficient. These orthogonal experimental properties are both necessary to account satisfactorily for the exposed surface area of amino acids in folded proteins. Moreover, the acceptor stem codes correctly for β-branched and carboxylic acid side-chains, whereas the anticodon codes for a wider range of such properties, but not for size or β-branching. These and other results suggest that genetic coding of 3D protein structures evolved in distinct stages, based initially on the size of the amino acid and later on its compatibility with globular folding in water.

genetic code | aminoacyl-tRNA synthetases | urzymes | multivariate regression | protein folding

The genetic code is implemented by two distinct superfamilies of protein-RNA complexes between an aminoacyl-tRNA synthetase (aaRS) from one of two classes (1, 2) and its cognate tRNA. These recognition complexes effect the transfer of activated amino acids to the correct tRNA molecule, producing aminoacyl-tRNAs needed for protein synthesis by the ribosome. Errors in charging are rare (3, 4), and it is generally agreed that the low frequency of mischarging is based on synthetase recognition of specific identity elements in tRNA molecules (5).
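The two-bits-per-base representation of identity elements mentioned in the abstract can be illustrated with a minimal sketch. The bit assignment used here (purine versus pyrimidine, and two versus three Watson-Crick hydrogen bonds) is an assumption made for illustration, since this excerpt does not state the actual assignment; the names `BASE_BITS`, `encode_bases`, and `design_matrix` are likewise hypothetical, and the example triplets are arbitrary.

```python
# Hypothetical two-bit code for each base. The bit assignment actually used
# in the paper is not given in this excerpt, so this mapping is an assumption.
# bit 0: 1 = purine (A, G), 0 = pyrimidine (C, U)
# bit 1: 1 = three Watson-Crick hydrogen bonds (G, C), 0 = two (A, U)
BASE_BITS = {
    "A": (1, 0),
    "G": (1, 1),
    "C": (0, 1),
    "U": (0, 0),
}


def encode_bases(bases):
    """Return a flat list of bits, two per base, in sequence order."""
    bits = []
    for base in bases.upper():
        if base not in BASE_BITS:
            raise ValueError(f"unexpected base: {base!r}")
        bits.extend(BASE_BITS[base])
    return bits


def design_matrix(sequences):
    """Encode equal-length base strings as rows of a 0/1 predictor matrix."""
    rows = [encode_bases(s) for s in sequences]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all sequences must have the same length")
    return rows


# Arbitrary example triplets, for illustration only
X = design_matrix(["GAA", "CAU", "UUU"])
```

Each row of `X` could then serve as the predictor vector for one amino acid in a multivariate regression against a measured side-chain property, such as a distribution coefficient.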
Many investigators (6-8) have observed that the codon table tends to reduce deleterious effects of point mutations (9) by assuring that they do minimal violence to the physical requirements of protein folding. One earlier study (10) identified a nonrandom tendency for hydrophilic side-chains to be coded by an A as the second codon base, hinting at more extensive relationships between the code and factors that direct protein folding.

tRNA identity elements (5, 11) map to both the anticodon and acceptor stem at opposite ends of the L-shaped tRNA molecule and are distinct from binding determinants for elongation factor Tu in the T-stem (12). Invariant cores of both classes of aaRS, termed urzymes (from the prefix ur- = primitive), lack anticodon-binding domains and cannot recognize the anticodon. However, they catalyze amino acid activation and acyl transfer with K_M values approaching those of contemporary aaRS, consistent with their participation in early protein synthesis (13-16). The implied ability of ancestral aaRS to recognize tRNA acceptor stems, but not anticodons, is consistent with the suggestion that the earliest proteins were coded no...