Does the nature of representation in the category-selective regions in the occipitotemporal cortex reflect visual or conceptual properties? Previous research showed that natural variability in visual features across categories, quantified by image gist statistics, is highly correlated with the different neural responses observed in the occipitotemporal cortex. Using fMRI, we examined whether category selectivity for animals and tools would remain, when image gist statistics were comparable across categories. Critically, we investigated how category, shape, and spatial frequency may contribute to the category selectivity in the animal-and tool-selective regions. Female and male human observers viewed low-or highpassed images of round or elongated animals and tools that shared comparable gist statistics in the main experiment, and animal and tool images of naturally varied gist statistics in a separate localizer. Univariate analysis revealed robust category-selective responses for images with comparable gist statistics across categories. Successful classification for category (animals/tools), shape (round/elongated), and spatial frequency (low/high) was also observed, with highest classification accuracy for category. Representational similarity analyses further revealed that the activation patterns in the animal-selective regions were most correlated with a model that represents only animal information, whereas the activation patterns in the tool-selective regions were most correlated with a model that represents only tool information, suggesting that these regions selectively represent information of only animals or tools. Together, in addition to visual features, the distinction between animal and tool representations in the occipitotemporal cortex are likely shaped by higher-level conceptual influences such as categorization or interpretation of visual inputs. Significance Statement: Since different categories often vary systematically in both visual and conceptual features, it remains unclear what kinds of information determine categoryselective responses in the occipitotemporal cortex. To minimize the influences of low-and mid-level visual features, here we used a diverse image set of animals and tools that shared comparable gist statistics. We manipulated category (animals/tools), shape (round/elongated) and spatial frequency (low/high), and found that the representational content of the animal-and tool-selective regions is primarily determined by their preferred categories only, regardless of shape or spatial frequency. Our results show that categoryselective responses in the occipitotemporal cortex are influenced by higher-level processing such as categorization or interpretation of visual inputs, and highlight the specificity in these category-selective regions.