Counting individuals in highly crowded environments, where thousands of people may be present, has attracted significant attention in recent years because of the many vertical markets in which such algorithms are useful, including smart cities, transportation, and retail. In this paper, we introduce a novel training methodology for estimating the number of people that achieves accurate counts in both moderately and highly crowded scenarios. The proposed approach formulates the problem as point detection, where each point represents an individual’s head. Our main contribution is a training strategy based on Curriculum Learning (CL), which mimics the gradual learning process observed in human cognition: training starts on simpler samples and moves to more complex ones as it progresses. To evaluate the complexity of each sample image, we propose a novel indicator that takes into account both the number of people and their distribution within the image. The experiments covered 18 publicly available datasets; the results validate the effectiveness of the proposed approach, which improves on the baseline state-of-the-art point detection method by 71% and 70% in terms of Mean Absolute Error (MAE) and Mean Squared Error (MSE), respectively.
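
To make the idea of complexity-ordered training concrete, the following is a minimal sketch of how samples might be ranked and grouped into curriculum stages. It is an illustration only: the `complexity` score (head count modulated by normalised mean nearest-neighbour distance), the `curriculum_stages` helper, and the sample dictionary layout are all assumptions introduced here and are not the indicator or schedule proposed in the paper.

```python
import math


def complexity(points, width, height):
    """Hypothetical difficulty score for one image (NOT the paper's indicator).

    points: list of (x, y) head annotations; width/height: image size in pixels.
    """
    n = len(points)
    if n < 2:
        return float(n)
    diag = math.hypot(width, height)
    # Mean nearest-neighbour distance, normalised by the image diagonal,
    # so it lies in [0, 1]; small values mean heads are tightly packed.
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    mean_nn = total / n / diag
    # Assumed combination: the count dominates, tighter packing raises the score.
    return n * (1.0 + (1.0 - mean_nn))


def curriculum_stages(samples, num_stages):
    """Return cumulative training subsets, easiest samples first.

    Stage k contains the easiest k/num_stages fraction of the data, so the
    last stage is the full training set.
    """
    ranked = sorted(
        samples,
        key=lambda s: complexity(s["points"], s["width"], s["height"]),
    )
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = math.ceil(len(ranked) * k / num_stages)
        stages.append(ranked[:cutoff])
    return stages
```

In such a scheme, a training loop would iterate over the stages in order, running some number of epochs on each subset before moving on to the next; how many stages to use and when to advance are design choices that this sketch leaves open.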