Performing accurate large eddy simulations in compressible, turbulent magnetohydrodynamics is more challenging than in non-magnetized fluids due to the complex interplay between kinetic, magnetic and internal energy at different scales. Here we extend the sub-grid-scale gradient model, so far used in the momentum and induction equations, to account also for the unresolved scales in the energy evolution equation of a compressible ideal MHD fluid with a generic equation of state. We assess the model by considering box simulations of the turbulence triggered across a shear layer by the Kelvin-Helmholtz instability, testing cases where the small-scale dynamics cannot be fully captured by the resolution considered, such that the efficiency of the simulated dynamo effect depends on the resolution employed. This lack of numerical convergence is actually a currently common issue in several astrophysical problems, where the integral and fastest-growing-instability scales are too far apart to be fully covered numerically. We perform a-priori and a-posteriori tests of the extended gradient model. In the former, we find that, for many different initial conditions and resolutions, the gradient model outperforms other commonly used models in terms of correlation with the residuals coming from the filtering of a high-resolution run. In the second test, we show how a low-resolution run with the gradient model is able to quantitatively reproduce the evolution of the magnetic energy (the integrated value and the spectral distribution) coming from higher-resolution runs. This extension is the first step towards the implementation in relativistic magnetohydrodynamics.