Computer vision is currently one of the most active research directions in artificial intelligence; its goal is to give computers the ability to perceive and understand their surroundings from a single image. Image recognition is an important branch of computer vision, with research significance and practical value in applications such as video surveillance, biometric identification, unmanned vehicles, human-computer interaction, and medical image recognition. In this article, we propose an end-to-end, pixel-to-pixel, IoT-oriented fuzzy support tensor product adaptive image classification method. Because traditional support tensor product classification methods have difficulty producing pixel-to-pixel classification results directly, the method builds on the idea of deconvolution network design: it directly outputs dense, pixel-by-pixel classification results for input images of arbitrary size, achieving true end-to-end, pixel-to-pixel classification of high-resolution images and improving the efficiency of support tensor product models for pixel-by-pixel classification of such images. Moreover, supervised training of deep networks requires a large amount of labeled data as ground truth, and obtaining such data is difficult in image classification. This article therefore proposes using a large number of unlabeled high-resolution remote sensing images to learn generic structural features through unsupervised learning, which assists supervised feature extraction and classification training on the labeled high-resolution remote sensing images.
By balancing the learning of generic structural image features against the learning of discriminative features related to the target class, the method reduces the dependence of supervised classification on the number of labeled samples and improves the robustness of the support tensor product algorithm when only a small number of labeled training samples are available.
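One common way to express such a balance between unsupervised and supervised learning is a weighted sum of a reconstruction loss on unlabeled data and a classification loss on labeled pixels. The following is a minimal sketch under that assumption only; the abstract does not state the actual objective, and the names `joint_loss` and `weight`, as well as the choice of cross-entropy and mean squared error, are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one labeled pixel."""
    return -math.log(max(probs[label], 1e-12))


def mse(pixels, reconstructed):
    """Mean squared reconstruction error between two flat pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(pixels, reconstructed)) / len(pixels)


def joint_loss(labeled, unlabeled, weight=0.5):
    """Weighted sum of a supervised and an unsupervised term.

    labeled:   list of (class_probabilities, true_label), one per labeled pixel
    unlabeled: list of (pixels, reconstructed_pixels), one per unlabeled patch
    weight:    hypothetical trade-off coefficient (an assumption, not from the source)
    """
    supervised = sum(cross_entropy(p, y) for p, y in labeled) / len(labeled)
    unsupervised = sum(mse(x, r) for x, r in unlabeled) / len(unlabeled)
    return supervised + weight * unsupervised, supervised, unsupervised


# One labeled pixel with two classes, one unlabeled two-pixel patch.
total, sup, unsup = joint_loss(
    labeled=[([0.5, 0.5], 0)],
    unlabeled=[([1.0, 0.0], [0.0, 0.0])],
    weight=0.5,
)
```

In this sketch, a larger `weight` pushes training toward generic structural features learned from unlabeled images, while a smaller one favors class-specific features from the labeled samples.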