Nonlinear soil behavior often exhibits a strong influence on surficial ground motions, and seismic site response models attempt to quantify these effects. However, site response models exhibit large uncertainties and can on occasion poorly replicate observed ground motions. In this study, nonlinear total-stress site response model predictions for 5626 ground motions at 114 KiKnet vertical seismometer arrays are calculated and compared to observed ground motions and predictions from linear and equivalent-linear analyses. Using this large database of onedimensional (1D) site response model predictions, a variety of statistical analyses are performed to quantify model bias and precision, and these statistical analyses are paired with physical insights into site and ground-motion behavior. As expected, there are deviations between the linear, equivalent-linear, and nonlinear site response constitutive models at large shear strains. When using cumulative Arias Intensity to assess the temporal evolution of model uncertainty, the equivalent-linear model is shown to have excessive bias early in the ground motion record, but this bias is obscured when the entire record is considered. Another less intuitive result is that on average, across the entire dataset, all models-linear, equivalent-linear, and nonlinear-tend to underpredict high-frequency ground motions in the aggregate. A number of physical hypotheses are explored to provide insights into these persistent biases. The use of a depth-dependent shear-wave velocity gradient, in particular, has a significant impact on the model bias, even more so than changing the constitutive model for dynamic soil behavior. Using an unprecedented number of sites and ground motions, the results of this study provide insights for the present limitations of, and potential improvements to, 1D site response model predictions.