“…There are two essential features of our computational model. First, unlike previous SOM models, which usually take the stimulus space (or pixel space) as input to simulate the lower visual cortex (Bednar & Miikkulainen, 2003; Durbin & Mitchison, 1990; Durbin & Willshaw, 1987; Kohonen, 1989; Konkle, 2021; Linsker, 1988), our SOM model takes a high-dimensional object representation space as input and outputs a tuned map (i.e., an artificial cortical surface of the VTC) in which nearby units project to nearby points in the object space (Figure 1). The high-dimensional object representation space is obtained from a pre-trained deep convolutional neural network (DCNN), AlexNet, because numerous studies have demonstrated a striking similarity in response profile and functional hierarchy between AlexNet and the human ventral visual cortex (e.g., Cichy, Khosla, Pantazis, Torralba, & Oliva, 2016; Guclu & van Gerven, 2015; Khaligh-Razavi & Kriegeskorte, 2014; Liu, Zhen, & Liu, 2020; Wen, Shi, Zhang, Lu, Cao, & Liu, 2018; Yamins, Hong, Cadieu, Solomon, Seibert, & DiCarlo, 2014).…”
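
To make the mapping from a feature space to a topographic map concrete, the following is a minimal sketch of a generic Kohonen-style SOM, not the authors' implementation. It assumes that each object is already described by a feature vector; here, synthetic 8-dimensional vectors stand in for DCNN activations, since the excerpt does not specify the layer, the dimensionality, or the training hyperparameters. The names `train_som`, `best_matching_unit`, and `toy_features`, as well as the linear decay schedules, are hypothetical choices made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate of the unit whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(features, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Fit a rows x cols map to feature vectors (lists of floats).

    Assumption: learning rate and neighbourhood width decay linearly;
    the paper's actual schedule is not given in the excerpt.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # One weight vector per map unit, initialised from randomly chosen inputs.
    weights = {
        (r, c): list(rng.choice(features))
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(features)
    step = 0
    for _ in range(epochs):
        order = list(features)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            bmu = best_matching_unit(weights, x)
            # Pull every unit toward x, weighted by its grid distance to the BMU,
            # so that neighbouring units end up representing nearby inputs.
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def toy_features(n_per_cluster=20, dim=8, seed=0):
    """Two well-separated synthetic clusters standing in for object features."""
    rng = random.Random(seed)
    feats = []
    for centre in (0.0, 5.0):
        for _ in range(n_per_cluster):
            feats.append([centre + rng.gauss(0.0, 0.1) for _ in range(dim)])
    return feats
```

Calling `train_som(toy_features(), rows=3, cols=3)` returns a dictionary from grid coordinates to weight vectors. Because the neighbourhood function depends only on grid distance, units that are adjacent on the map are updated together, which is what produces the property described above: nearby units project to nearby points in the input space.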