We develop an approach for feature elimination in statistical learning with kernel machines, based on recursive elimination of features. We present theoretical properties of this method and show that it is uniformly consistent in finding the correct feature space under certain generalized assumptions. We present four case studies to show that the assumptions are met in most practical situations, and present simulation results to demonstrate the performance of the proposed approach.

We also define the restricted space $\mathcal{F}_J$ as follows:

Definition 1. Let $J \subseteq \{1, 2, \ldots, d\}$ be a set of indices. Then, for a given functional space $\mathcal{F}$, define $\mathcal{F}_J = \{g : g = f \circ \pi_{J^c}, \; f \in \mathcal{F}\}$, where $\pi_{J^c}$ is the projection map that takes an element $x \in \mathbb{R}^d$ and maps it to $x_J \in \mathbb{R}^d$ by substituting the elements of $x$ indexed by the set $J$ with zero.

Remark 4. Note that we can subsequently define the space $\mathcal{X}_J = \{\pi_{J^c}(x) : x \in \mathcal{X}\}$. The above formulation thus allows us to create lower-dimensional versions of a given functional space $\mathcal{F}$.

We are now ready to state our feature selection method. The risk-RFE algorithm, defined for the parameters $\{\lambda_n, \delta_n\}$, is given as:

Algorithm 1 (risk-RFE). Start with $J \equiv \emptyset$ empty and let $Z \equiv \{1, 2, \ldots, d\}$.

STEP 1: In the $k$th iteration, choose the feature $i_k \in Z \setminus J$ which minimizes the regularized empirical risk $\mathcal{R}_{\mathrm{reg},\lambda_n}$ over the restricted space $\mathcal{F}_{J \cup \{i_k\}}$, and add $i_k$ to $J$.
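To make the elimination loop concrete, the following is a minimal sketch in Python. It assumes kernel ridge regression with a Gaussian kernel as the kernel machine and a simplified stopping rule that halts once removing a further feature raises the minimized regularized risk by more than $\delta_n$; the function names, the kernel choice, and the exact form of the stopping rule are illustrative assumptions, not fixed by the paper.

```python
# A hedged sketch of risk-RFE, assuming kernel ridge regression as the
# kernel machine; helper names and the Gaussian kernel are illustrative.
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def regularized_risk(X, y, lam):
    """Minimized regularized empirical risk of kernel ridge regression:
    min_alpha (1/n)||y - K alpha||^2 + lam * alpha' K alpha,
    whose minimizer solves (K + n*lam*I) alpha = y."""
    n = len(y)
    K = gaussian_kernel(X, X)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    fhat = K @ alpha
    return np.mean((y - fhat) ** 2) + lam * alpha @ K @ alpha

def risk_rfe(X, y, lam, delta):
    """Greedy risk-RFE: J accumulates eliminated features. Each pass
    zeroes out one more candidate column (the projection pi_{J^c} of
    Definition 1) and keeps the candidate whose removal gives the
    smallest regularized risk; stop when the risk degrades by > delta
    (a simplified surrogate for the paper's delta_n stopping rule)."""
    d = X.shape[1]
    J, Z = [], list(range(d))
    base = regularized_risk(X, y, lam)
    while len(J) < d:
        scores = {}
        for i in set(Z) - set(J):
            Xp = X.copy()
            Xp[:, J + [i]] = 0.0          # apply pi_{(J ∪ {i})^c}
            scores[i] = regularized_risk(Xp, y, lam)
        i_k = min(scores, key=scores.get)  # cheapest feature to eliminate
        if scores[i_k] - base > delta:     # removal now costs too much risk
            break
        J.append(i_k)
        base = scores[i_k]
    return J                               # features deemed irrelevant
```

Zeroing the columns indexed by $J \cup \{i\}$ before computing the kernel is exactly the restriction to $\mathcal{F}_{J \cup \{i\}}$ from Definition 1, since the fitted function can then depend only on the surviving coordinates.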