Direct ridership models can predict station-level urban rail transit ridership. Previous research on direct modeling of urban rail transit ridership has used different methods for handling overlapping coverage areas (such as the naive method or Thiessen polygons), different area analysis units (such as census block groups and census tracts), and various regression models (such as linear regression and negative binomial regression). However, the selection of these methods and models seems arbitrary. The objective of this research is to propose methods for selecting station-level urban rail transit ridership models and to evaluate the impact of this selection on model results and prediction accuracy. Urban rail transit ridership data for 2010 were collected from five cities: New York, San Francisco, Chicago, Philadelphia, and Boston. Using built environment characteristics as the independent variables and station-level ridership as the dependent variable, an analysis was conducted to examine differences in model performance in ridership prediction. Our results show that a large overlap of circular coverage areas greatly affects the accuracy of the models. The equal division method increases model accuracy significantly. In most cases, the generalized additive models have lower mean absolute percentage errors (MAPE) and higher adjusted
R² values. By comparison, the negative binomial models have lower Akaike information criterion (AIC) values. The choice of basic spatial analysis unit has only a marginal influence on the model results, so the unit can be chosen according to the data that are already available. In terms of model selection, advanced models seem to perform better than the linear regression models.
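The three comparison criteria named above (MAPE, adjusted R², and AIC) have standard textbook definitions, which the following minimal sketch illustrates. It is not the authors' code: the function names, the ridership figures, and the log-likelihood value are hypothetical and chosen only for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def adjusted_r2(actual, predicted, n_predictors):
    """Adjusted R^2: R^2 penalized for the number of predictors in the model."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical station-level ridership (not data from the study).
actual = [1000.0, 2000.0, 4000.0, 5000.0]
predicted = [1100.0, 1800.0, 4400.0, 5000.0]

station_mape = mape(actual, predicted)
station_adj_r2 = adjusted_r2(actual, predicted, n_predictors=1)
example_aic = aic(log_likelihood=-120.5, n_params=3)  # hypothetical fitted values
```

Lower MAPE and AIC and higher adjusted R² indicate a better model under each criterion. The criteria need not agree with one another, which is consistent with the generalized additive and negative binomial models each performing best on different measures.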