Mobile networks are experiencing a tremendous increase in data volume and user density due to the massive number of coexisting users and devices. An efficient technique to alleviate this issue is to bring the data closer to the users by exploiting cache-aided edge nodes, such as fixed and mobile access points, and even user devices. Meanwhile, the fusion of machine learning and wireless networks offers new opportunities for network optimization when traditional optimization approaches fail or incur high complexity. Among the various machine learning categories, reinforcement learning provides autonomous operation without relying on large sets of historical data for training. In this survey, reinforcement learning-aided mobile edge caching solutions are presented and classified based on the networking architecture and optimization target. As sixth generation (6G) networks will be characterized by high heterogeneity, fixed cellular, fog, cooperative, vehicular, and aerial networks are studied. The discussion of these works reveals that there exist reinforcement learning-aided caching schemes of varying complexity that can surpass the performance of conventional policy-based approaches. Finally, several open issues are presented, stimulating further interest in this important research field.

INDEX TERMS 6G, edge caching, heterogeneous networks, machine learning, mobile edge networks, reinforcement learning.