This study assesses the impact of incorporating an adaptive learning mechanism into an agent-based model that simulates behavior on a university campus during a pandemic outbreak, using the COVID-19 pandemic as a case study. Our model not only captures individual behavior but also serves as a tool for assessing the efficacy of geolocalized policies in reducing campus overcrowding and infections. The main objective is to demonstrate the effectiveness of reinforcement learning (RL) in representing agent behavior and optimizing control policies through adaptive decision-making in response to evolving pandemic dynamics. By implementing RL, we identify distinct temporal patterns of overcrowding violations, shedding light on the complexity of human behavior within semi-enclosed environments. Although we successfully reduce campus overcrowding, this reduction has only a limited impact on the pandemic’s course, which underlines the importance of comprehensive epidemic control strategies. Our research contributes to the understanding of adaptive learning in complex systems and provides insights for shaping future public health policies in similar community settings. It emphasizes the significance of considering individual decision-making influenced by adaptive learning, of implementing targeted interventions, and of accounting for geospatial elements in pandemic control. Future research directions include exploring a wider range of parameter settings and updating the representation of the disease’s natural history to enhance the applicability of these findings. Overall, the study highlights the need for multifaceted control strategies when managing pandemics in community settings.