Studies have used medical record discharge data coded according to the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) to estimate pneumococcal pneumonia incidence and vaccine efficacy. However, the accuracy of these coded data in identifying laboratory-confirmed pneumococcal pneumonia is not known. Using information collected in Ohio for a community-based pneumonia incidence study, the authors calculated the sensitivities, specificities, positive predictive values (PPV), and negative predictive values (NPV) of specific codes for pneumococcal pneumonia among hospitalized patients with community-acquired pneumonia. Sensitivities of the most common ICD-9-CM codes listed in the first five positions for patients with laboratory-confirmed pneumococcal pneumonia were 58.3% (code 481.0, pneumococcal pneumonia), 20.4% (38.0, streptococcal septicemia), 19.2% (38.2, pneumococcal septicemia), 15.0% (518.81, respiratory failure), 14.2% (486.0, pneumonia, organism unspecified), and 11.3% (482.3, streptococcal pneumonia). Using the first five listed ICD-9-CM codes rather than only the first listed code increased sensitivity without substantially changing specificity, PPV, or NPV. The sensitivity, PPV, and NPV of individual codes and of groups of codes varied with the case definition of pneumococcal pneumonia. Incidence and vaccine efficacy studies that can validate diagnoses by medical chart review can use a combination of many ICD-9-CM codes to maximize sensitivity. Without the ability to review medical charts, however, researchers must carefully decide which codes best suit their studies.
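The four accuracy measures named above follow from a 2x2 comparison of code status against the laboratory reference standard. The following is a minimal sketch of that calculation, not the authors' analysis code: the `Discharge` record type, the `accuracy` function, and the six example records are hypothetical and chosen only to show how widening the search from the first listed code to the first five positions can change sensitivity.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Discharge(NamedTuple):
    """One hospital discharge record (hypothetical structure)."""
    codes: Tuple[str, ...]  # ICD-9-CM codes in the order listed
    lab_confirmed: bool     # reference standard: laboratory-confirmed case


def accuracy(records: Iterable[Discharge],
             target_codes: Set[str],
             positions: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV of a code group.

    A record is flagged when any target code appears within the first
    `positions` listed codes. Assumes every denominator is nonzero.
    """
    tp = fp = fn = tn = 0
    for rec in records:
        flagged = any(c in target_codes for c in rec.codes[:positions])
        if flagged and rec.lab_confirmed:
            tp += 1
        elif flagged:
            fp += 1
        elif rec.lab_confirmed:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical records for illustration only; not data from the study.
records = [
    Discharge(("481.0", "518.81"), True),   # target code listed first
    Discharge(("518.81", "481.0"), True),   # target code listed second
    Discharge(("486.0",), True),            # confirmed case, never flagged
    Discharge(("481.0",), False),           # flagged but not confirmed
    Discharge(("486.0", "518.81"), False),
    Discharge(("486.0",), False),
]

first_only = accuracy(records, {"481.0"}, positions=1)
first_five = accuracy(records, {"481.0"}, positions=5)
```

In this toy data set, the second record is missed when only the first listed code is examined and is captured when the first five positions are searched, so sensitivity rises while specificity is unchanged. Real data could of course behave differently, which is why the choice of codes and positions needs validation against a reference standard.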