Selecting, from a large set of variables, those that most influence a response variable is an important problem in statistics. When the assumed regression model involves a nonparametric component, penalized regression techniques, and in particular P-splines, are among the commonly used methods. The aim of this paper is to provide a brief review of variable selection methods using P-splines. Starting from multiple linear regression models, with least-squares regression and Ridge regression, we review standard methods that perform variable selection, such as the LASSO, the nonnegative garrote, and the SCAD method. We briefly discuss a general framework of penalization and regularization methods. Moving toward more flexible regression models, with some nonparametric component(s), we discuss P-spline estimation. For some examples of flexible regression models, we then review a few variable selection methods using P-splines. A brief discussion of grouped regularization techniques and of a robust variable selection method is given. Furthermore, we mention key ingredients in Bayesian approaches, and end the paper by drawing attention to several other issues in variable selection with P-splines. Throughout the paper we provide some illustrations.
LEAST-SQUARES AND RIDGE REGRESSION
In regression analysis the interest is in finding out how, on average, a variable of interest $Y$ is influenced by some explanatory variables $X_1, \ldots, X_d$. In multiple linear regression the relationship between $Y$ and $(X_1, \ldots, X_d)$ is modeled via
$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_d X_d + \varepsilon,$$
with $\beta = (\beta_0, \beta_1, \ldots, \beta_d)^T$ the vector of unknown regression coefficients, and where $\varepsilon$ denotes the error term. The superscript $T$ denotes the transpose of a matrix or a vector.

Often measurements on many potentially influential factors $X_1, \ldots, X_d$ are available, and selecting among the $d$ variables those that have a significant average influence on the response variable $Y$ is a major concern.

Suppose that an i.i.d. sample $(Y_i, X_{i1}, \ldots, X_{id})$, $i = 1, \ldots, n$, is available. The ordinary least-squares method consists of solving the optimization problem
$$\min_{\beta} \sum_{i=1}^{n} \Big( Y_i - \beta_0 - \sum_{j=1}^{d} \beta_j X_{ij} \Big)^2$$
and results in the least-squares estimator $\hat{\beta}$ of $\beta$. We rewrite the
Volume 7, January/February 2015