Thanks to their easy implementation via Radial Basis Functions (RBFs), meshfree kernel methods have been proved to be an effective tool for e.g. scattered data interpolation, PDE collocation, classification and regression tasks. Their accuracy might depend on a length scale hyperparameter, which is often tuned via cross validation schemes. Here we leverage approaches and tools from the machine learning community to introduce two-layered kernel machines, which generalize the classical RBF approaches that rely on a single hyperparameter. Indeed, the proposed learning strategy returns a kernel that is optimized not only in the Euclidean directions, but that further incorporates kernel rotations. The kernel optimization is shown to be robust by using recently improved calculations of cross validation scores. Finally, the use of greedy approaches, and specifically of the Vectorial Kernel Orthogonal Greedy Algorithm (VKOGA), allows us to construct an optimized basis that adapts to the data. Beyond a rigorous analysis on the convergence of the so-constructed two-Layered (2L)-VKOGA, its benefits are highlighted on both synthesized and real benchmark data sets.