The global trend toward urbanization has spurred the widespread adoption of transit-oriented development (TOD). Previous research has extensively explored the relationship between land use and TOD ridership, but much of it has focused on linear associations at a single spatial scale. Leveraging recent advances in nonlinear modeling and the availability of open-source data, this study adopts a two-step methodology. First, a K-means clustering algorithm groups TOD sites in Shenzhen into three distinct clusters, providing a site-based understanding of their characteristics. Second, a Light Gradient Boosting Machine (LightGBM) classification model, interpreted with SHapley Additive exPlanations (SHAP) values, quantifies the influence of mixed land use on TOD ridership across catchment areas of different buffer radii. We find that the effects of land-use factors on TOD site patronage differ by buffer radius, and that these relationships are nonlinear, with both positive and negative contributions. For instance, residents and health sites positively affect patronage at all buffer radii, whereas certain commercial land uses have a negative influence. The study also shows that the importance of different land-use structures varies across the three clusters, shedding light on how land use shapes ridership within TOD catchment areas. Based on the predominant characteristics of each cluster, we offer actionable recommendations to help urban managers optimize land-use mixes.
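As an illustration of the first step only, the following is a minimal sketch of K-means (Lloyd's algorithm) grouping sites into three clusters. It is not the study's implementation: the feature columns (residential, commercial, and health land-use shares), the synthetic site data, and the function name `kmeans` are all assumptions made for the example. The second step (LightGBM with SHAP) is not sketched here.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen sites.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each site joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


# Hypothetical data: 60 synthetic sites drawn around three assumed
# land-use profiles (residential, commercial, health shares).
rng = random.Random(42)
profiles = [(0.7, 0.2, 0.1), (0.2, 0.7, 0.1), (0.3, 0.3, 0.4)]
sites = [
    tuple(m + rng.gauss(0, 0.03) for m in profile)
    for profile in profiles
    for _ in range(20)
]

centroids, labels = kmeans(sites, k=3)
for c, centre in enumerate(centroids):
    print(c, labels.count(c), [round(v, 2) for v in centre])
```

Because K-means depends on its initial centroids, a single run can settle in a local optimum; in practice one would compare several seeds and standardize the features before clustering.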