Over the past years, the recommender systems community has developed a series of novel approaches with steadily improving prediction accuracy. Sequential recommendation, such as music recommendation, has seen large gains from neural models such as recurrent neural networks and transformers. When sequential information is unavailable or irrelevant, however, as in book, movie, or product recommendation, the classic k-nearest neighbor algorithm appears to remain competitive, even when compared with much more sophisticated methods. In this paper, we attempt to explain the inner workings of the nearest neighbor method using probabilistic tools, treating similarity as a conditional probability and presenting a novel model for explaining and removing popularity bias. First, we provide a probabilistic formulation of similarity and of the classic prediction formula. Second, by modeling user behavior as a combination of personal preference and global influence, we explain the presence of popularity bias in the predictions. Finally, we use Bayesian inference to construct a theoretically grounded variant of the widely used inverse frequency scaling, which mitigates the effect of popularity bias on the predictions. By replacing the formerly ad hoc choices of the nearest neighbor method with probabilistically founded counterparts, we improve prediction accuracy across a variety of data sets and gain a deeper understanding of the theory behind the method.
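The two ingredients named above — similarity read as a conditional probability, and inverse frequency scaling to damp popular items — can be sketched for implicit-feedback, item-based nearest neighbors as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the toy data, the function names, and the damping exponent `alpha` are all assumptions.

```python
from collections import defaultdict

# Toy implicit-feedback data: user -> set of consumed items (illustrative only).
interactions = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"a", "c"},
    "u4": {"b", "c"},
}

# Invert to item -> set of users, which gives item popularity counts.
item_users = defaultdict(set)
for user, items in interactions.items():
    for item in items:
        item_users[item].add(user)

def cond_prob_similarity(i, j):
    """Similarity as a conditional probability: P(j | i) = |U_i & U_j| / |U_i|."""
    if not item_users[i]:
        return 0.0
    return len(item_users[i] & item_users[j]) / len(item_users[i])

def damped_similarity(i, j, alpha=0.5):
    """Inverse-frequency-style scaling: divide by the popularity of the
    target item j, raised to an assumed hyperparameter alpha."""
    return cond_prob_similarity(i, j) / (len(item_users[j]) ** alpha)

def score(user, candidate, alpha=0.5):
    """Classic neighborhood prediction: accumulate similarity between the
    candidate item and every item in the user's history."""
    return sum(damped_similarity(i, candidate, alpha) for i in interactions[user])
```

With `alpha = 0`, the score reduces to the plain conditional-probability formulation, so frequently consumed items dominate the ranking; increasing `alpha` trades that popularity signal away, which is the kind of ad hoc knob the Bayesian treatment in the paper is meant to replace with a principled counterpart.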