In this paper, the problem of self-organizing, correlation-aware clustering is studied for a dense network of machine-type devices (MTDs) deployed over a cellular network. In dense machine-to-machine networks, MTDs are typically located in close proximity and gather correlated data; thus, clustering MTDs based on data correlation decreases the number of redundant bits transmitted to the base station. To analyze this clustering problem, a novel utility function that captures the average MTD transmission power per cluster is derived as a function of the MTD location, cluster size, and inter-cluster interference. Then, the clustering problem is formulated as an evolutionary game, which models the interactions among the massive number of MTDs, in order to decrease MTD transmission power. To solve this game, a distributed algorithm is proposed to allow the infinite number of MTDs to autonomously form clusters. It is shown that the proposed distributed algorithm converges to an evolutionary stable strategy (ESS) that is robust to a small portion of MTDs deviating from the stable cluster formation at convergence. The maximum fraction of MTDs that can deviate from the ESS, while still maintaining a stable cluster formation, is derived. Simulation results show that the proposed approach can effectively cluster MTDs with highly correlated data, which, in turn, enables those MTDs to eliminate a large number of redundant bits. The results show that, on average, the proposed approach yields reductions of up to 23.4% and 9.6% in the transmit power per cluster, compared to forming clusters with the maximum possible size and uniformly selecting a cluster size, respectively. A preliminary version of this work appeared in IEEE GLOBECOM 2017 [1].
I. INTRODUCTION
Machine-to-machine (M2M) communications is an important component of the emerging Internet of Things (IoT) system, as it enables advanced networked applications such as smart home technologies, smart grid, healthcare, drone systems, manufacturing systems, and surveillance [2]-[7]. Within an M2M network, a massive number of machine-type devices (MTDs) will be densely deployed over wireless cellular networks [8]. An MTD can be a sensor, actuator, or smart meter whose typical role is to sense or measure an environment and transmit the collected data to cellular base stations (BSs). MTDs enable real-time monitoring and control of any physical environment without direct human involvement, thus making processes more efficient and improving human welfare [4], [9]. Since the number of MTDs is expected to be massive and much larger than the number of cellular-type devices (CTDs), a massive-scale MTD deployment is expected to lead to quality-of-service (QoS) degradation, increased traffic and signaling, increased latency, and reduced reliability [4]. Therefore, deploying MTDs over cellular networks faces many challenges ranging from network modeling to resource management, massive-scale access, and MTD clustering as mention...