Satellite communications (SatComs) systems are facing a massive increase in traffic demand. However, this increase is not uniform across the service area, owing to the uneven distribution of users and diurnal variations in traffic demand. Flexible payload architectures address this problem by allowing payload resources to be allocated according to the traffic demand of each beam. While optimization-based radio resource management (RRM) has shown significant performance gains, its high computational complexity limits its practical implementation in real systems. In this paper, we discuss the architecture, implementation, and applications of Machine Learning (ML) for resource management in multibeam GEO satellite systems. We focus mainly on two systems: one with power, bandwidth, and/or beamwidth flexibility, and the other with time flexibility, i.e., beam hopping. We analyze and compare different ML techniques that have been proposed for these architectures, emphasizing the use of Supervised Learning (SL) and Reinforcement Learning (RL). To this end, we determine whether training should be conducted online or offline based on the characteristics and requirements of each proposed ML technique, and we discuss the most appropriate system architecture and the advantages and disadvantages of each approach.