We study the optimal approximation of the solution of an operator equation $A(u) = f$ by four types of mappings: (a) linear mappings of rank $n$; (b) $n$-term approximation with respect to a Riesz basis; (c) approximation based on linear information about the right-hand side $f$; (d) continuous mappings. We consider worst-case errors, where $f$ is an element of the unit ball of a Sobolev or Besov space $B^r_q(L_p(\Omega))$ and $\Omega \subset \mathbb{R}^d$ is a bounded Lipschitz domain; the error is always measured in the $H^s$-norm. The respective widths are the linear widths (or approximation numbers), the nonlinear widths, the Gelfand widths, and the manifold widths. As a technical tool, we also study the Bernstein numbers. Our main results are the following. If $p \ge 2$, then the order of convergence is the same for all four classes of approximations. In particular, the best linear approximations are of the same order as the best nonlinear ones. The best linear approximation can be quite difficult to realize as a numerical algorithm, since the optimal Galerkin space usually depends on the operator and on the shape of the domain $\Omega$. For $p < 2$ there is a difference: nonlinear approximations are better than linear ones. However, in this case, it turns out that linear information about the right-hand side $f$ is again optimal. Our main theoretical tool is the best $n$-term approximation with respect to an optimal Riesz basis and related nonlinear widths. These general results are used to study the Poisson equation in a polygonal domain. It turns out that best $n$-term wavelet approximation is (almost) optimal. The main results of this paper are about approximation, not about computation. However, we also discuss consequences of the results for the numerical complexity of operator equations.
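
For orientation, the error notions behind (a) and (b) can be written out as follows. This is a sketch in standard notation, not necessarily the notation of the paper: the solution operator $S = A^{-1}$, the source space $F$ (standing for the Sobolev or Besov space), the symbols $e(S_n)$, $e_n^{\mathrm{lin}}$ and $\sigma_n$, and the Riesz basis $\mathcal{B} = \{h_k\}$ are names assumed here for illustration.

```latex
% Assumed notation: S = A^{-1} is the solution operator, F the source space.
% Worst-case error of a mapping S_n over the unit ball of F, measured in H^s:
\[
  e(S_n) = \sup_{\|f\|_F \le 1} \bigl\| S(f) - S_n(f) \bigr\|_{H^s}
\]

% (a) Linear widths (approximation numbers): best linear mapping of rank at most n.
\[
  e_n^{\mathrm{lin}} = \inf_{\substack{S_n \text{ linear} \\ \operatorname{rank} S_n \le n}} e(S_n)
\]

% (b) Best n-term approximation of a single element u with respect to
%     a Riesz basis B = {h_k}; the index set Lambda may depend on u.
\[
  \sigma_n(u, \mathcal{B}) = \inf_{|\Lambda| \le n} \, \inf_{(c_k)_{k \in \Lambda}}
    \Bigl\| u - \sum_{k \in \Lambda} c_k h_k \Bigr\|_{H^s}
\]
```

In (a) the approximating space is fixed before $f$ is seen, whereas in (b) the index set $\Lambda$ is chosen separately for each $u$, which is what makes $n$-term approximation nonlinear.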