Research in modern data-driven dynamical systems typically focuses on three key challenges: high dimensionality, unknown dynamics and nonlinearity. The dynamic mode decomposition (DMD) has emerged as a cornerstone for modelling high-dimensional systems from data. However, the linear DMD model is known to be fragile with respect to strong nonlinearity, which contaminates the model estimate. By contrast, sparse identification of nonlinear dynamics learns fully nonlinear models, disambiguating the linear and nonlinear effects, but is restricted to low-dimensional systems. In this work, we present a kernel method that learns interpretable data-driven models for high-dimensional, nonlinear systems. Our method performs kernel regression on a sparse dictionary of samples that appreciably contribute to the dynamics. We show that this kernel method efficiently handles high-dimensional data and is flexible enough to incorporate partial knowledge of system physics. The approach also recovers the linear contribution to the model, separating it from the effects of the implicitly defined nonlinear terms. We demonstrate our approach on data from a range of nonlinear ordinary and partial differential equations. This framework can be used for many practical engineering tasks such as model order reduction, diagnostics, prediction, control and discovery of governing laws.
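
As a rough illustration of kernel regression on a sparse dictionary of samples, followed by recovery of the linear contribution, the following minimal sketch may help. It is not the implementation used in this work. The quadratic polynomial kernel, the residual-based selection rule, the tolerance `tol`, the toy three-dimensional system, and the names `kernel` and `build_dictionary` are all assumptions made purely for illustration.

```python
import numpy as np

def kernel(A, B):
    # Assumed quadratic polynomial kernel between the columns of A (n x p) and B (n x q).
    return (1.0 + A.T @ B) ** 2

def build_dictionary(X, tol=1e-6):
    # Hypothetical greedy selection: keep a sample only if its squared
    # feature-space residual against the current dictionary exceeds tol.
    idx = [0]
    for t in range(1, X.shape[1]):
        D = X[:, idx]
        x = X[:, [t]]
        k_dx = kernel(D, x)
        coeffs = np.linalg.lstsq(kernel(D, D), k_dx, rcond=None)[0]
        residual = (kernel(x, x) - k_dx.T @ coeffs).item()
        if residual > tol:
            idx.append(t)
    return X[:, idx]

# Toy data (illustrative only): y = A_true x + quadratic terms.
rng = np.random.default_rng(0)
n, m = 3, 200
A_true = np.array([[-0.5, 1.0, 0.0],
                   [-1.0, -0.5, 0.0],
                   [0.0, 0.0, -0.2]])
X = rng.uniform(-1.0, 1.0, size=(n, m))
Y = A_true @ X + 0.3 * np.vstack([X[0] * X[1], X[1] * X[2], X[0] ** 2])

# Sparse dictionary of samples, then least-squares fit of Y ~ W k(D, X).
D = build_dictionary(X)
W = np.linalg.lstsq(kernel(D, X).T, Y.T, rcond=None)[0].T

# Linear contribution: Jacobian of x -> W k(D, x) at x = 0.
# For this kernel, the gradient of (1 + d.x)^2 with respect to x at x = 0 is 2 d.
L = 2.0 * W @ D.T

print("dictionary size:", D.shape[1], "of", m, "samples")
print("max error in recovered linear part:", np.abs(L - A_true).max())
```

In this toy setting the dictionary stays far smaller than the number of samples, and the matrix `L` approximates the linear operator `A_true` even though the quadratic terms are only represented implicitly through the kernel. Different kernels or selection rules would change both the dictionary and the expression for the linear part.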