To navigate both natural and mental spaces, the brain uses the hexagonal metric of grid cells in the entorhinal cortex (ERC) to chart these spaces. However, little is known about how and why this hexagonal metric emerges. Here we designed an object-matching task in which participants in an fMRI scanner adjusted two parts of object variants to match their prototype. Unbeknownst to the participants, the object variants were arranged on a ring centered on the prototype in an object space defined by the two object parts. As participants navigated from a location on the ring to the center of this space, we observed hexagonal signals in the ERC and, more importantly, a spatial rhythm of 3 Hz in the hippocampus (HPC). Accordingly, we built a computational model in which the 3 Hz spatial rhythm serves as a scaffold that transforms discrete spatial locations into structured states of a cognitive map, allowing location estimates to be updated continuously with each movement. A formal proof shows that, to support this cognitive map in the HPC, the spatial input from the ERC must form a hexagonal pattern. To further examine the biological plausibility of the model, we built several types of neural networks, each in its simplest architecture, to transform locations in space into structured states of a cognitive map; units with hexagonal patterns emerged ubiquitously. In short, by combining empirical experiments, formal proof, and neural network simulation, our study reveals the mechanistic origin of the hexagonal metric of grid cells.
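As an illustration of what a hexagonal signal means in a ring-to-center design, the following minimal sketch uses a common idealization of a grid-like map (the sum of three cosine gratings oriented 60 degrees apart) and averages it along straight paths from a ring to its center. This is not the computational model or the analysis pipeline of the study; the function names, the grid spacing, the ring radius, and the 10-degree direction sampling are all assumptions chosen for the example. Under these assumptions, the path-averaged signal repeats every 60 degrees of movement direction, which is the six-fold signature the abstract refers to.

```python
import numpy as np

def grid_pattern(x, y, spacing=1.0, orientation=0.0):
    """Idealized grid-like map: sum of three cosine gratings 60 degrees apart.

    Hypothetical helper for illustration; `spacing` and `orientation`
    are assumed parameters, not values taken from the study.
    """
    k = 4.0 * np.pi / (np.sqrt(3.0) * spacing)  # wave number for the chosen peak spacing
    angles = orientation + np.arange(3) * np.pi / 3.0
    return sum(np.cos(k * (x * np.cos(a) + y * np.sin(a))) for a in angles)

def mean_signal_along_path(direction, radius=2.0, n_steps=200):
    """Average the map along a straight path from a ring location to the center."""
    r = np.linspace(radius, 0.0, n_steps)
    return float(np.mean(grid_pattern(r * np.cos(direction), r * np.sin(direction))))

# Sample movement directions every 10 degrees around the ring (assumed sampling).
directions = np.deg2rad(np.arange(0, 360, 10))
signal = np.array([mean_signal_along_path(d) for d in directions])

# A six-fold (hexagonal) signal is unchanged by a 60-degree shift in direction,
# i.e. by rolling the samples by 6 steps of 10 degrees.
six_fold = np.allclose(signal, np.roll(signal, 6))
# A 30-degree shift (3 steps) does not leave the signal unchanged.
thirty_deg_invariant = np.allclose(signal, np.roll(signal, 3))
print(six_fold, thirty_deg_invariant)
```

The 60-degree invariance follows from the construction: rotating the three gratings by 60 degrees maps the set of grating orientations onto itself, because a cosine grating is unchanged when its orientation is flipped by 180 degrees.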