Feature selection and feature transformation, the two main ways to reduce dimensionality, are often presented separately. In this paper, we propose a feature selection method that combines the popular transformation-based dimensionality reduction method linear discriminant analysis (LDA) with sparsity regularization. We impose row sparsity on the transformation matrix of LDA through l2,1-norm regularization to achieve feature selection, and the resulting formulation simultaneously selects the most discriminative features and removes the redundant ones. The formulation is then extended to the l2,p-norm regularized case, which is more likely to yield sparser solutions when 0 < p < 1 and is therefore a better approximation to the feature selection problem. An efficient algorithm is developed to solve the l2,p-norm-based optimization problem, and we prove that the algorithm converges when 0 < p ≤ 2. Systematic experiments are conducted to understand the behavior of the proposed method. Promising experimental results on various types of real-world data sets demonstrate the effectiveness of our algorithm.
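
To illustrate why row sparsity of the transformation matrix amounts to feature selection, the following minimal sketch computes an l2,p-norm and ranks features by the l2 norm of their rows. It assumes the common definition (sum_i ||w_i||_2^p)^(1/p), where w_i is the i-th row of W. The function names `row_norms`, `l2p_norm`, and `select_features` and the toy matrix are hypothetical and are not taken from the proposed algorithm, which additionally involves the LDA objective and an iterative solver not shown here.

```python
import math

def row_norms(W):
    """l2 norm of each row of W (one row per original feature)."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]

def l2p_norm(W, p=1.0):
    """Assumed definition: (sum_i ||w_i||_2^p)^(1/p); p = 1 gives the l2,1-norm."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(n ** p for n in row_norms(W)) ** (1.0 / p)

def select_features(W, k):
    """Indices of the k rows with the largest l2 norm, in descending order."""
    norms = row_norms(W)
    order = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    return order[:k]

# Toy transformation matrix: 4 features projected to 2 dimensions.
# Rows 1 and 3 are all zero, so those features do not contribute to the projection.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [1.0, 0.0],
     [0.0, 0.0]]

print(l2p_norm(W, p=1.0))     # l2,1-norm: sum of row norms
print(l2p_norm(W, p=0.5))     # a value of p in (0, 1)
print(select_features(W, 2))  # features with the largest row norms
```

A feature whose row is driven to zero is dropped from the projection entirely, which is why penalizing the row norms performs selection rather than only shrinkage. Under this definition, p = 2 recovers the Frobenius norm, which does not promote row sparsity.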