Aims
There are substantial geographical variations in obesity prevalence. Sociodemographic and environmental determinants of health (SEDH), understood as upstream determinants of obesogenic behaviors, may contribute to this disparity. We therefore investigated high-risk SEDH potentially associated with adult obesity in American counties using machine learning (ML) techniques.

Materials and methods
We performed a cross-sectional analysis of county-level adult obesity prevalence (BMI ≥30 kg/m²) in the U.S. using data from the Diabetes Surveillance System 2017. We harvested 49 county-level SEDH factors, which were used by a Classification and Regression Trees (CART) model to identify county-level clusters. CART was validated using a “hold-out” set of counties, and variable importance was evaluated using Random Forest.

Results
Overall, we analyzed 2,752 counties in the U.S., identifying a national median obesity prevalence of 34.1% (IQR, 30.2, 37.7). CART identified 11 clusters with a 60.8% relative increase in prevalence across the spectrum. Additionally, 7 key SEDH variables were identified by CART to guide the categorization of clusters, including Physically Inactive (%), Diabetes, Severe Housing Problems (%), Food Insecurity (%), Uninsured (%), Population over 65 years (%), and Non-Hispanic Black (%).

Conclusion
There is significant county-level geographical variation in obesity prevalence in the United States, which can in part be explained by complex SEDH factors.
The use of ML techniques to analyze these factors can provide valuable insights into the importance of these upstream determinants of obesity and, therefore, aid in the development of geo-specific strategic interventions and optimize resource allocation to help battle the obesity pandemic.

Article Highlights
Why did we undertake this study?
To improve the understanding of the association between complex sociodemographic and environmental determinants of health (SEDH) and obesity prevalence in the U.S.
What specific question(s) did we want to answer?
What are the SEDH associated with obesity prevalence?
What did we find?
Seven key SEDH variables were identified by CART to guide the categorization of clusters, including Physically Inactive (%), Diabetes, Severe Housing Problems (%), Food Insecurity (%), Uninsured (%), Population over 65 years (%), and Non-Hispanic Black (%).
What are the implications of our findings?
Our study shows the importance of SEDH for the regional variation of obesity prevalence and aids in the development of geo-specific strategies to reduce disparities.
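
As a rough illustration of how a CART-style regression tree partitions counties into clusters, and how a relative increase in prevalence between the lowest and highest cluster can be computed, the following minimal sketch may help. It is not the authors' code: the data are synthetic, the feature names `inactive` and `food_insecure` and the helpers `sse`, `best_split`, and `grow` are hypothetical, and the hold-out validation and Random Forest variable importance described in Materials and methods are omitted.

```python
import random
import statistics


def sse(values):
    """Sum of squared errors around the mean (regression-tree impurity)."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, target, features, min_leaf):
    """Return the (feature, threshold) that most reduces total SSE, or None."""
    best = None
    best_cost = sse([r[target] for r in rows])
    for f in features:
        ordered = sorted(rows, key=lambda r: r[f])
        # Try every cut that leaves at least min_leaf rows on each side.
        for i in range(min_leaf, len(ordered) - min_leaf + 1):
            lo, hi = ordered[i - 1][f], ordered[i][f]
            if lo == hi:
                continue  # no threshold can separate identical values
            cost = (sse([r[target] for r in ordered[:i]])
                    + sse([r[target] for r in ordered[i:]]))
            if cost < best_cost:
                best_cost, best = cost, (f, (lo + hi) / 2)
    return best


def grow(rows, target, features, depth, min_leaf=20):
    """Recursively partition rows; each returned leaf is one cluster."""
    split = best_split(rows, target, features, min_leaf) if depth > 0 else None
    if split is None:
        return [rows]
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (grow(left, target, features, depth - 1, min_leaf)
            + grow(right, target, features, depth - 1, min_leaf))


# Synthetic "counties" (assumption): obesity rises with both features.
rng = random.Random(0)
counties = []
for _ in range(400):
    inactive = rng.uniform(15, 40)
    food_insecure = rng.uniform(5, 25)
    obesity = 15 + 0.6 * inactive + 0.2 * food_insecure + rng.gauss(0, 1.5)
    counties.append({"inactive": inactive,
                     "food_insecure": food_insecure,
                     "obesity": obesity})

train = counties[:300]  # the remaining rows could serve as a hold-out set
clusters = grow(train, "obesity", ["inactive", "food_insecure"], depth=2)

# Mean prevalence per cluster, ordered from lowest to highest.
means = sorted(statistics.fmean([r["obesity"] for r in c]) for c in clusters)
relative_increase = (means[-1] - means[0]) / means[0] * 100
print(f"{len(clusters)} clusters, relative increase {relative_increase:.1f}%")
```

In practice an established implementation such as a library decision-tree regressor would be preferable; the hand-written version is used here only to make the splitting criterion explicit.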