Health effects are unknown for the vast majority of the >83,000 chemicals in commerce, which makes it difficult to balance industrial needs against the complex landscape of health susceptibilities and exposures. The National Health and Nutrition Examination Survey (NHANES), a large-scale survey aimed at determining the prevalence and risk factors of major diseases, is increasingly used to postulate relationships between chemicals and adverse health effects in the U.S. population. The interpretation of these studies is complicated, however, by the ad hoc data mining approaches typically employed. Here we describe the use of frequent itemset mining for identifying exposure ⇒ health associations in data from the NHANES 2005-2006 cycle. From 9,440 dichotomized samples, 983 two-itemset rules were generated describing associations between markers of health and environmental exposure (lift >1, response threshold >97.5th quantile). A case study using parathyroid hormone levels to develop exposure-health effect hypotheses is presented. This case study demonstrates how association rules can be used in data mining to facilitate hypothesis development and to improve traditional regression models by identifying potentially confounding variables, even in the presence of missing information. Our approach is designed to enable more effective knowledge discovery of potential health impacts of environmental chemicals by facilitating comprehensive data mining and meta-analysis of the NHANES dataset. In the long term, our representation of the information allows for integration with other disparate data, such as known biological pathways, to address the current data gaps.

* This project was supported in part by an appointment to the Internship/Research Participation Program at the Office of Research and Development, U.S. Environmental Protection Agency, administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and EPA. We thank Drs. Lyle Burgoon, Jennifer Richmond-Bryant, and Jon Sobus for helpful discussion and review of early drafts of the paper. The views expressed in this paper are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency.
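To make the rule-generation step concrete, the following is a minimal Python sketch of how two-itemset rules with lift >1 could be derived from measurements dichotomized at an upper quantile. It illustrates the general technique only and is not the pipeline used in this work. The function names, the linear-interpolation quantile, and the pairwise handling of missing values (a sample is skipped for a rule when either variable is missing) are all assumptions made for the example.

```python
from itertools import permutations


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, q in [0, 1]."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def dichotomize(table, q=0.975):
    """Flag each value as True if it exceeds its column's q-th quantile.

    `table` maps a variable name to a list of measurements, one per sample.
    Missing measurements are given as None and stay None in the output.
    """
    flags = {}
    for name, column in table.items():
        observed = [v for v in column if v is not None]
        cut = quantile(observed, q)
        flags[name] = [None if v is None else v > cut for v in column]
    return flags


def two_item_rules(flags, min_lift=1.0):
    """Return (lhs, rhs, support, confidence, lift) for rules lhs => rhs.

    Assumption: each rule is evaluated only on samples where both
    variables were measured, so missing values do not discard a sample
    from the rules that involve its other variables.
    """
    rules = []
    for lhs, rhs in permutations(flags, 2):
        pairs = [(x, y) for x, y in zip(flags[lhs], flags[rhs])
                 if x is not None and y is not None]
        n = len(pairs)
        n_lhs = sum(x for x, _ in pairs)
        n_rhs = sum(y for _, y in pairs)
        if n == 0 or n_lhs == 0 or n_rhs == 0:
            continue  # lift is undefined without both items present
        n_both = sum(x and y for x, y in pairs)
        support = n_both / n
        confidence = n_both / n_lhs
        lift = confidence / (n_rhs / n)
        if lift > min_lift:
            rules.append((lhs, rhs, support, confidence, lift))
    return rules
```

A call such as `two_item_rules(dichotomize(table))` returns every ordered pair of variables whose elevated values co-occur more often than expected under independence. Restricting the left-hand side to exposure markers and the right-hand side to health markers would be a further filtering step, not shown here.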