Approximately 40-60% of group A streptococcal (GAS) isolates can opacify sera owing to expression of the sof (serum opacity factor) gene. The emm (M protein gene) and sof 5' sequences were obtained from a diverse set of GAS reference strains and clinical isolates, and correlated with M serotyping and anti-opacity-factor testing results. Attempts to amplify sof from strains with M serotypes or emm types historically associated with the opacity-factor-negative phenotype were unsuccessful, except for emm12 strains, which were found to contain a highly conserved sof sequence. There was a strong correlation of certain M serotypes with specific emm sequences regardless of strain background, and likewise a strong association of specific anti-opacity-factor (AOF) types with sof gene sequence types. In several examples, M type identity, or partial identity shared between strains with differing emm types, was correlated with short, highly conserved 5' emm sequences likely to encode M-type-specific epitopes. Additionally, each of three pairs of historically distinct M type reference strains found to share the same 5' emm sequence was also found to share M serotype specificity. Based upon sof sequence comparisons between strains of the same and of differing AOF types, an approximately 450-residue domain was determined likely to contain key epitopes required for AOF type specificity. Analysis of two Sof sequences that were not highly homologous, yet shared a common AOF type, further implicated a 107-aa portion of this 450-residue domain as likely to contain AOF-specific epitopes. Taken together, the serological data suggest that AOF-specific epitopes for all Sof proteins may reside within a region corresponding to this 107-residue sequence.
The presence of specific, hypervariable emm/sof pairs within multiple isolates appears likely to be a reliable indicator of their overall genetic relatedness, and to be very useful for accurate subtyping of GAS isolates by an approach that is relevant to decades of past M-type-based epidemiological data.