Surrogate models are used to alleviate the computational burden in engineering tasks, which require the repeated evaluation of computationally demanding models of physical systems, such as the efficient propagation of uncertainties.For models that show a strongly non-linear dependence on their input parameters, standard surrogate techniques, such as polynomial chaos expansion, are not sufficient to obtain an accurate representation of the original model response.It has been shown that for models with discontinuities or rational dependencies, for example, frequency response functions of dynamic systems, the use of a rational (Padé) approximation can significantly improve the approximation accuracy. In order to avoid overfitting issues in previously proposed standard least squares approaches, we introduce a sparse Bayesian learning approach to estimate the coefficients of the rational approximation. Therein the linearity in the numerator polynomial coefficients is exploited and the denominator polynomial coefficients as well as the problem hyperparameters are determined through type-II-maximum likelihood estimation. We apply a quasi-Newton gradient-descent algorithm to find the optimal denominator coefficients and derive the required gradients through application of CR-calculus. The method is applied to the frequency response functions of an algebraic frame structure model as well as that of an orthotropic plate finite element model.