Massive machine-type communication (mMTC) has been identified as an important use case in Beyond 5G networks and the future massive Internet of Things (IoT). However, if the conventional 4-step random access (RA) scheme is employed for massive multiple access in mMTC, serious preamble collisions arise. Consequently, a range of grant-free (GF) RA schemes have been proposed. Nevertheless, if the number of cellular users (devices) increases significantly, both the energy and spectrum efficiency of the existing GF schemes still degrade rapidly owing to the much longer preambles required. To overcome this dilemma, a layered grouping strategy is proposed, where the cellular users are first divided into clusters based on their geographical locations, and the users of the same cluster then autonomously join different groups by using an optimum energy consumption (Opt-EC) based K-means algorithm. With this new layered cellular architecture, the RA process is divided into a cluster load estimation phase and an active group detection phase. Based on the state evolution theory of the approximate message passing algorithm, a tight lower bound on the minimum preamble length required for achieving a certain detection accuracy is derived. Benefiting from the cluster load estimation, a dynamic preamble selection (DPS) strategy is invoked in the second phase, so that the preambles employed have the minimum required length. As evidenced by our simulation results, this two-phase DPS-aided RA strategy results in a significant performance improvement.
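
As a rough illustration of the layered grouping idea, the following minimal sketch first partitions users into clusters by location and then partitions each cluster into groups. It uses plain Euclidean K-means in both layers as a stand-in; it does not implement the Opt-EC criterion, whose energy model is not specified here. The function names, the cell size, and the cluster and group counts are all hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means (assumed stand-in, not Opt-EC); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute the centroids; keep the old one if a cluster became empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


def layered_grouping(positions, n_clusters, groups_per_cluster, seed=0):
    """Layer 1: clusters by location. Layer 2: groups inside each cluster."""
    cluster_of, _ = kmeans(positions, n_clusters, seed=seed)
    group_of = np.zeros(len(positions), dtype=int)
    for c in range(n_clusters):
        members = np.flatnonzero(cluster_of == c)
        k = min(groups_per_cluster, len(members))
        if k == 0:
            continue  # nothing to group in an empty cluster
        local_labels, _ = kmeans(positions[members], k, seed=seed)
        group_of[members] = local_labels  # group index is local to the cluster
    return cluster_of, group_of


# Hypothetical scenario: 200 users dropped uniformly in a 1 km x 1 km cell.
rng = np.random.default_rng(1)
positions = rng.uniform(0.0, 1000.0, size=(200, 2))
cluster_of, group_of = layered_grouping(positions, n_clusters=4, groups_per_cluster=3)
```

Each user ends up with a (cluster, group) index pair. In a scheme of the kind described above, the first index would be the unit for load estimation and the second the unit for active group detection.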