A model that fully describes the response properties of visual neurons must be able to predict their activity during natural vision. While many models have been proposed for the visual system, few have ever been tested against this criterion. To address this issue, we have developed a general framework for fitting and validating nonlinear models of visual neurons using natural visual stimuli. Our approach derives from linear spatiotemporal receptive field (STRF) analysis, which has frequently been used to study the visual system. Unlike standard STRF estimation, however, our framework applies a linearizing transformation to the stimulus before the linear filtering stage in order to account for nonlinear response properties. We used this approach to compare two models for neurons in primary visual cortex: a nonlinear Fourier power model, which accounts for spatial phase-invariant tuning, and a traditional linear model. We characterized prediction accuracy as a fraction of the total explainable variance, given intrinsic experimental noise. On average, Fourier power STRFs predicted 40% of explainable variance, whereas linear STRFs predicted only 21%. The performance of the Fourier power model provides a benchmark for evaluating more sophisticated models in the future.
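The two-stage idea described above (a linearizing transformation followed by a linear spatiotemporal fit) can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration rather than the procedure used in this study: the function names `fourier_power`, `lagged_design`, and `fit_ridge` are hypothetical, ridge regression stands in for whatever estimator is actually used, the stimulus is Gaussian noise rather than natural images, and accuracy is reported as a plain held-out correlation rather than as a fraction of explainable variance (which would require repeated trials to estimate noise).

```python
import numpy as np

def fourier_power(frames):
    """Linearizing transform: 2-D Fourier power of each (H, W) frame, flattened.
    Phase is discarded, so the features are invariant to spatial phase.
    rfft2 keeps only the non-redundant half of the spectrum of a real image."""
    spectrum = np.fft.rfft2(frames, axes=(-2, -1))
    return (np.abs(spectrum) ** 2).reshape(frames.shape[0], -1)

def lagged_design(features, n_lags):
    """Stack features at delays 0..n_lags-1 so a linear fit spans space and time."""
    T, D = features.shape
    X = np.zeros((T, D * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * D:(lag + 1) * D] = features[:T - lag]
    return X

def fit_ridge(X, y, alpha):
    """Closed-form ridge regression (an assumed estimator, chosen for brevity)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

# Simulated phase-invariant neuron: it responds to the power at one spatial
# frequency of the previous frame, plus Gaussian noise (all values hypothetical).
rng = np.random.default_rng(0)
T, n_lags, n_train = 600, 2, 400
frames = rng.standard_normal((T, 8, 8))
power = fourier_power(frames)
y = np.empty(T)
y[0] = power[:, 7].mean()
y[1:] = power[:-1, 7]
y += 20.0 * rng.standard_normal(T)

def held_out_r(features):
    """Fit on the first n_train samples, return correlation on the remainder."""
    X = lagged_design(features, n_lags)
    w, b = fit_ridge(X[:n_train], y[:n_train], alpha=10.0)
    return np.corrcoef(X[n_train:] @ w + b, y[n_train:])[0, 1]

r_linear = held_out_r(frames.reshape(T, -1))  # linear STRF on raw pixels
r_power = held_out_r(power)                   # STRF on Fourier power features
print(f"linear model r = {r_linear:.2f}, Fourier power model r = {r_power:.2f}")
```

Because the simulated response depends on stimulus power rather than on pixel values, the pixel-based linear fit should predict held-out responses poorly while the Fourier power fit should predict them well. This toy contrast only illustrates why a linearizing transformation can help; it says nothing about the magnitude of the effect in real cortical data.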