We consider the problem of deriving from experimental data an approximation of an unknown function, whose derivatives also approximate the unknown function derivatives. Solving this problem is useful, for instance, in the context of nonlinear system identification for obtaining models that are more accurate and reliable than the traditional ones, based on plain function approximation. Indeed, models identified by accounting for the derivatives can provide a better performance in several tasks, such as multi-step prediction, simulation, Nonlinear Model Predictive Control, and control design in general. In this paper, we propose a novel approach based on convex optimization, allowing us to solve the aforementioned identification problem. We develop an optimality analysis, showing that models derived using this approach enjoy suitable optimality properties in Sobolev spaces. The optimality analysis also leads to the derivation of tight uncertainty bounds on the unknown function and its derivatives. We demonstrate the effectiveness of the approach with two numerical examples. The first one is concerned with approximation of an univariate function, the second one with multi-step prediction of the Chua chaotic circuit.