Pooled, papain-solubilized HLA-A, -B, and -C antigens, derived from a large number of individuals and comprising several allelic forms, have been subjected to amino acid sequence determination. Despite the heterogeneity of the material, a main sequence representing all of the 273 amino acid residues could be established. The primary structure encompasses two immunoglobulin-like disulfide loops.

The ... β2-microglobulin (4-6), and a 45,000-dalton, membrane-integrated, glycosylated heavy chain (7). The genetic polymorphism resides in the amino acid sequence, but no information is available on the correlation between the variable amino acids and the serological specificities. Because immunological, chemical, and physical analyses strongly suggest that the overall conformation of the HLA antigens is similar regardless of the allelic form, it is conceivable that the amino acid variability occurs in discrete region(s) of the molecule that do not influence the overall conformation (8). This would be analogous to the situation for antibodies, in which the hypervariable regions are clustered to form the antigen-binding site (9).

β2-Microglobulin is homologous to immunoglobulin constant domains (10). However, the amino acid sequence variability among the various HLA antigen specificities occurs in the heavy chain. Limited amino acid sequence information on murine (11) and human (12, 13) transplantation antigen heavy chains generated claims for homologies with immunoglobulins. As more data became available, it became evident that the HLA antigen heavy chains contain at least one region displaying obvious sequence homology with β2-microglobulin and the constant immunoglobulin domains (14-17).

In a previous communication we outlined the amino acid sequence of about one-third of the HLA antigen heavy chain and showed that this portion of the molecule was similar to immunoglobulins (16).
We have now completed determining ...

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U. S. C. §1734 ...

... (18). Because the starting material was heterogeneous, multiple amino acid residues were encountered in several positions. Here, the main sequence (i.e., the quantitatively dominating residue at each position) has been used in the computer analyses. In the statistical analyses, the use of amino acid residues occurring in smaller amounts than the main residue did not change the overall results of the computer analyses.

Statistical Analyses for Relatedness to Other Proteins. The amino acid sequence of the HLA antigen heavy chain was compared to known sequences, maintained in a protein sequence data file (19), by using the SEARCH program (20). An input matrix, the mutation data matrix (250 PAM) (21), and a matrix bias parameter, B = 2, were supplied to the program. To analyze the homology between the HLA antigen heavy chain and β2-microglobulin and the IgG variable and constant domains, the RELATE program was used (22, †). Program ALIGN ...