Before a cognitive map is generated through the firing of rodent hippocampal spatial cells, mammals acquire knowledge of the outside world through visual information that travels from the eye to the brain. According to biophysiological research, this information is encoded and transferred to two regions of the brain, known as the “what” loop and the “where” loop. In this article, we simulate an episodic memory recognition unit that integrates the information from both loops and apply it to building an accurate bioinspired spatial cognitive map of real environments. We employ a visual bag-of-words algorithm based on oriented Features from Accelerated Segment Test and rotated Binary Robust Independent Elementary Features (ORB) to build the “what” loop, and a hippocampal spatial-cell cognitive model driven by the front-end visual information input system to build the “where” loop. The resulting environmental cognitive map is a topological map whose nodes store the competitive firing rates of place cells, ORB feature descriptors, image-retrieval similarity scores, and the relative locations of the map nodes. Simulation experiments and physical experiments on a mobile robot platform verify the environmental adaptability and robustness of the algorithm. The proposed algorithm provides a foundation for further research on bioinspired robot navigation.
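To make the “what”-loop retrieval step concrete, the following minimal sketch (not the authors' implementation) builds an ORB-based bag-of-words vocabulary with OpenCV and scores the similarity between two camera views; the image paths, vocabulary size, and feature count are placeholder assumptions.

```python
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=500)  # feature count is an assumed parameter

def orb_descriptors(path):
    """Detect ORB keypoints and return their binary descriptors."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = orb.detectAndCompute(img, None)
    return desc

# 1. Cluster training descriptors into a visual vocabulary (k-means).
trainer = cv2.BOWKMeansTrainer(64)  # 64 visual words, an assumed size
for path in ["view_a.png", "view_b.png"]:  # hypothetical training views
    # BOWKMeansTrainer requires float32 input, ORB descriptors are uint8.
    trainer.add(orb_descriptors(path).astype(np.float32))
vocab = trainer.cluster()

# 2. Quantize an image's descriptors into a normalized BoW histogram.
matcher = cv2.BFMatcher(cv2.NORM_L2)

def bow_histogram(path):
    desc = orb_descriptors(path).astype(np.float32)
    # Assign each descriptor to its nearest visual word in the vocabulary.
    words = [m.trainIdx for m in matcher.match(desc, vocab)]
    hist, _ = np.histogram(words, bins=np.arange(len(vocab) + 1))
    return hist / max(hist.sum(), 1)

# 3. Cosine similarity between histograms scores "what"-loop recall.
h1, h2 = bow_histogram("view_a.png"), bow_histogram("view_b.png")
similarity = float(h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12))
print(f"BoW image-retrieval similarity: {similarity:.3f}")
```

In a full system of the kind the abstract describes, a similarity score like this would be one of the quantities attached to each topological map node, alongside the place-cell firing rates and relative node locations.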