Recent progress on improving the theoretical sample efficiency of model-based reinforcement learning (RL), which exhibits superior sample complexity in practice, requires Gaussian and Lipschitz assumptions on the transition model, and additionally relies on a posterior representation that grows unbounded with time. In this work, we propose a novel Kernelized Stein Discrepancy-based Posterior Sampling for RL algorithm (named KSRL) which extends model-based RL based upon posterior sampling (PSRL) in several ways: we (i) relax the need for any smoothness or Gaussian assumptions, allowing for complex mixture models; (ii) ensure applicability to large-scale training by incorporating a compression step so that the posterior consists of a Bayesian coreset of only statistically significant past state-action pairs; and (iii) develop a novel regret analysis of PSRL based upon integral probability metrics, which, under a smoothness condition on the constructed posterior, can be evaluated in closed form as the kernelized Stein discrepancy (KSD). Consequently, we improve the $O(H^{3/2} d \sqrt{T})$ regret of PSRL to $O(H^{3/2} \sqrt{T})$, where $d$ is the input dimension, $H$ is the episode length, and $T$ is the total number of episodes experienced, alleviating the linear dependence on $d$. Moreover, we theoretically establish a trade-off between regret rate and posterior representational complexity by introducing a compression budget parameter based on the KSD, and establish a lower bound on the representational complexity required for consistency of the model. Experimentally, we observe that this approach is competitive with several state-of-the-art RL methodologies, and can achieve up to a 50% reduction in wall-clock time in some continuous control environments.
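To make the KSD-based compression idea concrete, the following is a minimal, self-contained sketch (not the paper's KSRL algorithm): it computes a V-statistic estimate of the squared KSD under an RBF base kernel and applies a greedy thinning rule that drops a posterior particle whenever removing it increases the KSD by no more than a budget `eps`. The function names (`rbf_stein_kernel`, `ksd`, `compress`), the backward-elimination rule, and the standard-Gaussian target used in the demo are all illustrative assumptions, not the authors' exact construction.

```python
import numpy as np

def rbf_stein_kernel(X, score_X, h=1.0):
    """Pairwise Stein kernel u_p(x, y) under an RBF base kernel.

    X:       (n, d) array of particles.
    score_X: (n, d) array of score evaluations grad_x log p(x) at each particle.
    h:       RBF bandwidth.
    Returns the (n, n) matrix U with U[i, j] = u_p(x_i, x_j).
    """
    n, d = X.shape
    diff = X[:, None, :] - X[None, :, :]          # (n, n, d), diff[i, j] = x_i - x_j
    sqdist = np.sum(diff ** 2, axis=-1)           # (n, n) squared distances
    K = np.exp(-sqdist / (2.0 * h ** 2))          # base RBF kernel k(x_i, x_j)

    # Gradients of the base kernel:
    #   grad_x k(x, y) = k(x, y) * (y - x) / h^2
    #   grad_y k(x, y) = k(x, y) * (x - y) / h^2
    grad_x_k = K[..., None] * (-diff) / h ** 2
    grad_y_k = K[..., None] * diff / h ** 2

    term1 = (score_X @ score_X.T) * K                   # s(x)^T s(y) k(x, y)
    term2 = np.einsum('id,ijd->ij', score_X, grad_y_k)  # s(x)^T grad_y k(x, y)
    term3 = np.einsum('jd,ijd->ij', score_X, grad_x_k)  # s(y)^T grad_x k(x, y)
    term4 = K * (d / h ** 2 - sqdist / h ** 4)           # trace(grad_x grad_y k)
    return term1 + term2 + term3 + term4

def ksd(X, score_fn, h=1.0):
    """V-statistic estimate of the squared kernelized Stein discrepancy."""
    U = rbf_stein_kernel(X, score_fn(X), h)
    return U.mean()

def compress(points, score_fn, eps=1e-3, h=1.0):
    """Greedy KSD-thresholded coreset: drop a point whenever removing it
    raises the KSD of the retained set by no more than the budget eps."""
    kept = list(range(len(points)))
    for i in list(kept):
        trial = [j for j in kept if j != i]
        if len(trial) < 2:
            break
        if ksd(points[trial], score_fn, h) - ksd(points[kept], score_fn, h) <= eps:
            kept = trial
    return points[kept]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical target posterior: standard Gaussian, so score(x) = -x.
    score_fn = lambda X: -X
    samples = rng.normal(size=(200, 2))
    coreset = compress(samples, score_fn, eps=1e-4)
    print(len(samples), "->", len(coreset), "points; KSD:",
          ksd(samples, score_fn), "->", ksd(coreset, score_fn))
```

The key design point mirrored here is that the KSD only needs the score (gradient of the log-density), not a normalized posterior, so the compression budget `eps` directly trades representation size against how faithfully the coreset tracks the full posterior.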