Sparse modeling for signal processing and machine learning has, in general, been a focus of scientific research for over two decades. Among others, supervised sparsity-aware learning comprises two major paths paved by: a) discriminative methods, which establish a direct input-output mapping based on the optimization of a regularized cost function, and b) generative methods, which learn the underlying distributions. The latter, more widely known as Bayesian methods, enable uncertainty evaluation with respect to the performed predictions. Furthermore, they can better exploit related prior information and, in principle, can naturally introduce robustness into the model, owing to their unique capacity to marginalize out uncertainties related to the parameter estimates. Moreover, the hyper-parameters (tuning parameters) associated with the adopted priors, which correspond to cost-function regularizers, can be learned from the training data rather than via costly cross-validation techniques, as is generally the case with discriminative methods. To implement sparsity-aware learning, the crucial point lies in the choice of the regularizer for discriminative methods and in the choice of the prior distribution for Bayesian learning. Over the last decade or so, due to the intense research on deep learning, emphasis has been put on discriminative techniques. However, a comeback of Bayesian methods is taking place, shedding new light on the design of deep neural networks, which also establishes firm links with Bayesian models.