Near‐infrared (NIR) spectral information is important for detecting and analyzing material compositions. However, snapshot NIR spectral imaging systems still face significant challenges: high‐performance NIR filters are lacking and existing setups are bulky, which prevents effective spectral encoding and integration with mobile devices. In this study, we introduce a snapshot spectral imaging system that employs a compact NIR metasurface featuring 25 distinct C4‐symmetric structures. Owing to the sufficient spectral variety of these structures and the low correlation coefficients among them, a center‐wavelength accuracy of 0.05 nm and a full width at half maximum (FWHM) accuracy of 0.13 nm are achieved. The system maintains good performance for incident angles within 1°. A novel meta‐attention network prior iterative denoising reconstruction (MAN‐IDR) algorithm, guided by both meta‐attention and spatial‐spectral attention, is developed to achieve high‐quality NIR spectral imaging. With the designed metasurface and MAN‐IDR, NIR spectral images with precise textures, minimal spatial artifacts, and little crosstalk between spectral channels are reconstructed from a single grayscale recorded image. Furthermore, spatially overlapping images in different colors are successfully reconstructed, demonstrating the potential of the system for distinguishing objects. By enabling practical image reconstruction and distinction, the proposed NIR metasurface and MAN‐IDR hold great promise for integration with smartphones and drones, facilitating the adoption of NIR spectral imaging in real‐world scenarios such as aerospace, health diagnostics, and machine vision.
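
As an illustration of the two ideas the abstract relies on, the following minimal sketch shows (i) how a set of filters with distinct transmission spectra encodes an incident spectrum into one grayscale reading per filter, and (ii) how the pairwise correlation among the filter spectra can be quantified. It is not the implementation used in this work. The transmission spectra are random placeholders rather than the designed metasurface responses, the number of spectral bands (64) and the Gaussian test spectrum are arbitrary assumptions, and the function names `pearson`, `mean_abs_offdiag_correlation`, and `encode_pixel` are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_abs_offdiag_correlation(spectra):
    """Average |r| over all distinct pairs of filter spectra (lower is better)."""
    total, pairs = 0.0, 0
    for i in range(len(spectra)):
        for j in range(i + 1, len(spectra)):
            total += abs(pearson(spectra[i], spectra[j]))
            pairs += 1
    return total / pairs


def encode_pixel(spectra, incident):
    """One grayscale reading per filter: sum over bands of T_k(band) * incident(band)."""
    return [sum(t * s for t, s in zip(filt, incident)) for filt in spectra]


# Placeholder data (assumption): 25 random transmission spectra over 64 bands.
random.seed(0)
NUM_FILTERS, NUM_BANDS = 25, 64
filters = [[random.random() for _ in range(NUM_BANDS)] for _ in range(NUM_FILTERS)]

# Assumed test input: a single Gaussian spectral peak centered on band 20.
incident = [math.exp(-((b - 20) / 5.0) ** 2) for b in range(NUM_BANDS)]

readings = encode_pixel(filters, incident)       # 25 grayscale values
score = mean_abs_offdiag_correlation(filters)    # mean |r| across filter pairs
print(len(readings), round(score, 3))
```

In this toy setting, a lower mean absolute correlation means the 25 readings carry less redundant information about the incident spectrum, which is the property a reconstruction algorithm depends on when recovering many spectral channels from a single grayscale frame.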