Traditional hyperspectral imagers rely on scanning either the spectral or the spatial dimension of the hyperspectral cube, using spectral filters or line scanning. Such scanning can be time-consuming and generally requires precise moving parts, which increases system complexity. More recently, snapshot techniques have emerged that capture the full hyperspectral datacube in a single shot. However, some of these snapshot systems are bulky and complicated, which makes them difficult to apply in real-world settings. Therefore, this paper proposes a compact snapshot hyperspectral imaging system based on compressive sensing theory, which consists of an imaging lens, a beam splitter, a microlens array, a metasurface-covered sensor, and an RGB camera. Light from the object first passes through the imaging lens, and the splitter then divides it equally into two paths. In one path, the light passes through the microlens array and is then modulated by a metasurface on the imaging sensor. In the other path, the light is received directly by the RGB camera. This system has the following advantages: first, the metasurface supercells can be designed and arranged to optimize the transfer matrix of the system; second, the microlens array ensures that light is incident on the metasurface at a small angle, which eliminates the transmittance error introduced by the incidence angle; third, the RGB camera provides side information that eases the reconstruction.
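To make the two-path measurement concrete, the following is a minimal sketch of a simplified forward model, not the system's actual calibration or reconstruction. It assumes that each sensor pixel under the metasurface integrates the scene spectrum weighted by the transmittance of the supercell element above it, that the supercell tiles the sensor periodically, that the microlens array keeps incidence close to normal so angle dependence can be ignored, and that both paths see the same pixel grid. The function name `simulate_measurements`, the array shapes, and the random spectra are all hypothetical placeholders.

```python
import numpy as np


def simulate_measurements(cube, supercell, rgb_response, split=0.5):
    """Simplified two-path forward model (illustrative assumptions only).

    cube:         (H, W, B) hyperspectral scene with B spectral bands
    supercell:    (P, P, B) transmittance spectra of one metasurface
                  supercell (hypothetical values)
    rgb_response: (3, B) spectral response of the RGB camera (hypothetical)
    split:        fraction of light sent to the metasurface path
    """
    H, W, _ = cube.shape
    P = supercell.shape[0]

    # Tile the supercell periodically over the sensor, then crop to H x W.
    reps = (-(-H // P), -(-W // P), 1)  # ceiling division along rows/cols
    mask = np.tile(supercell, reps)[:H, :W, :]

    # Metasurface path: one coded scalar measurement per pixel.
    meta = split * np.sum(mask * cube, axis=2)

    # RGB path: three broadband measurements per pixel (side information).
    rgb = (1.0 - split) * np.einsum("hwb,cb->hwc", cube, rgb_response)
    return meta, rgb


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cube = rng.random((6, 8, 16))       # placeholder scene
    supercell = rng.random((3, 3, 16))  # placeholder transmittance spectra
    rgb_response = rng.random((3, 16))  # placeholder camera response
    meta, rgb = simulate_measurements(cube, supercell, rgb_response)
    print(meta.shape, rgb.shape)
```

In this sketch, the rows of the system transfer matrix are the tiled supercell spectra together with the RGB responses, so changing the `supercell` array corresponds to the design freedom described in the first advantage. A reconstruction algorithm would invert these stacked measurements, which is outside the scope of this sketch.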