Linear operations can only partially exploit the statistical redundancies of natural scenes, and nonlinear operations are ubiquitous in visual cortex. However, neither the detailed function of these nonlinearities nor the higher-order image statistics are yet fully understood. We suggest that these complicated issues cannot be tackled by any single approach, but require a range of methods and an understanding of the cross-links between their results. We consider three basic approaches: (i) State-space descriptions can in theory provide complete information about statistical properties and nonlinear operations, but their practical use is confined to very low-dimensional settings. We discuss the use of representation-related state-space coordinates (multivariate wavelet statistics) and of basic nonlinear coordinate transformations of the state space (e.g., a polar transform). (ii) Indirect methods, such as unsupervised learning in multi-layer networks, provide complete optimization results, but no direct information about the statistical properties and no simple model structures. (iii) Approximation by the lower-order terms of power-series expansions is a classical strategy that has not yet received broad attention. On the statistical side, this approximation amounts to cumulant functions and higher-order spectra (polyspectra); on the processing side, to Volterra-Wiener systems. In this context we suggest that an important concept for the understanding of natural scene statistics, of nonlinear neurons, and of biological pattern recognition lies in AND-like combinations of frequency components. We investigate how the different approaches can be related to each other, how they can contribute to the understanding of cortical nonlinearities such as complex cells, cortical gain control, end-stopping, and other extra-classical receptive field properties, and how we can obtain a nonlinear perspective on overcomplete representations and invariant coding in visual cortex.
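As a concrete illustration of the polar transform mentioned under point (i), the following minimal sketch (Python with NumPy; the function name and variable names are ours, not from the original) maps the joint responses of a quadrature pair of wavelet filters into polar state-space coordinates, i.e., a local amplitude and a local phase:

```python
import numpy as np

def polar_state_space(even_resp, odd_resp):
    """Map paired wavelet coefficients (even/odd quadrature responses)
    to polar state-space coordinates.

    Hypothetical sketch: `even_resp` and `odd_resp` are assumed to be
    arrays of responses from two filters in quadrature."""
    amplitude = np.hypot(even_resp, odd_resp)   # radial coordinate
    phase = np.arctan2(odd_resp, even_resp)     # angular coordinate
    return amplitude, phase
```

The point of such a coordinate transformation is that the statistical dependencies between the two filter responses, which are hard to see in Cartesian coordinates, can become simple (e.g., concentrated in the amplitude variable) after the transform.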
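For point (iii), a truncated Volterra series is the standard processing-side model. The sketch below (again a hypothetical illustration, assuming a one-dimensional discrete input; the kernels `h1` and `h2` are free parameters) computes the output of a second-order Volterra system, whose quadratic kernel captures exactly the pairwise signal interactions that second- and higher-order cumulant statistics describe on the input side:

```python
import numpy as np

def volterra2(x, h1, h2):
    """Second-order Volterra system with linear kernel h1 (length M)
    and quadratic kernel h2 (M x M):
        y[n] = sum_i h1[i] x[n-i] + sum_{i,j} h2[i,j] x[n-i] x[n-j]
    Sketch only: no boundary handling beyond skipping the first M-1 samples."""
    M = len(h1)
    y = np.zeros(len(x))
    for n in range(M - 1, len(x)):
        window = x[n - M + 1:n + 1][::-1]   # [x[n], x[n-1], ..., x[n-M+1]]
        y[n] = h1 @ window + window @ h2 @ window
    return y
```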
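Finally, one way to read "AND-like combinations of frequency components" is as a multiplicative combination of the local energies of two frequency bands: the unit responds only if both components are present, in contrast to a linear (OR-like) sum. The following sketch illustrates this reading under our own assumptions (Python/NumPy; quadrature demodulation with a crude boxcar low-pass as the energy estimator; all names hypothetical):

```python
import numpy as np

def bandpass_energy(signal, freq, fs):
    """Local energy of `signal` in a narrow band around `freq`:
    complex demodulation followed by low-pass smoothing."""
    t = np.arange(len(signal)) / fs
    demod = signal * np.exp(-2j * np.pi * freq * t)
    kernel = np.ones(64) / 64                   # crude low-pass filter
    return np.abs(np.convolve(demod, kernel, mode="same"))

def and_combination(signal, f1, f2, fs):
    """AND-like unit: large output only if BOTH components are present."""
    return bandpass_energy(signal, f1, fs) * bandpass_energy(signal, f2, fs)

fs = 1000.0
t = np.arange(2048) / fs
both = np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 120 * t)
alone = np.sin(2 * np.pi * 50 * t)
print(and_combination(both, 50, 120, fs).mean())   # clearly nonzero
print(and_combination(alone, 50, 120, fs).mean())  # near zero
```

Such a product of band energies is exactly the kind of term that appears in the quadratic part of a Volterra expansion, which is one of the cross-links between approaches (i) and (iii).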