A temporally compressive imaging system reconstructs a high-speed image sequence from a single coded snapshot. As in other compressive sensing systems, the reconstruction quality often depends on the structure of the measurement and on the choice of regularization. In this paper, we report a compressive video system that also captures side information to aid the reconstruction of high-speed scenes. Integrating the side information not only improves the reconstruction quality but also reduces the dependence of the reconstruction on regularization. We have implemented a system prototype that splits the field of view of a single camera into two channels: one channel captures the coded, low-frame-rate measurement for high-speed video reconstruction, and the other captures a direct, uncoded measurement as the side information. A joint reconstruction model is developed to recover the high-speed videos from the two channels. In both experiments and simulations, the reconstructions with side information achieve higher peak signal-to-noise ratio and structural similarity than those without.
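
As an illustration of the two-channel measurement described above, the following is a minimal sketch, not the model used in this work. It assumes that the coded channel sums mask-modulated frames over one exposure and that the side channel sums the same frames without coding. The function names, the binary random masks, the `weight` parameter, and the plain gradient-descent solver are all hypothetical choices made for illustration, and no regularization term is included.

```python
import numpy as np

def coded_snapshot(video, masks):
    """Assumed coded channel: sum of mask-modulated frames over one exposure."""
    return np.sum(masks * video, axis=0)

def side_snapshot(video):
    """Assumed side channel: uncoded integration of the same frames."""
    return np.sum(video, axis=0)

def joint_reconstruct(y, s, masks, weight=1.0, n_iter=500):
    """Gradient descent on an illustrative joint data-fidelity cost:
    ||y - sum_t C_t x_t||^2 + weight * ||s - sum_t x_t||^2.
    No regularizer is used; masks are assumed to take values in [0, 1]."""
    n_frames = masks.shape[0]
    x = np.zeros(masks.shape, dtype=float)
    # Step size 1/L, where L = 2 * T * (1 + weight) bounds the gradient's
    # Lipschitz constant when every mask entry lies in [0, 1].
    step = 1.0 / (2.0 * n_frames * (1.0 + weight))
    for _ in range(n_iter):
        r_coded = coded_snapshot(x, masks) - y   # residual of coded channel
        r_side = side_snapshot(x) - s            # residual of side channel
        grad = 2.0 * (masks * r_coded[None] + weight * r_side[None])
        x -= step * grad
    return x

# Usage: simulate both channels from a random 4-frame scene, then fit them.
rng = np.random.default_rng(0)
video = rng.random((4, 6, 6))
masks = rng.integers(0, 2, size=(4, 6, 6)).astype(float)
y = coded_snapshot(video, masks)
s = side_snapshot(video)
estimate = joint_reconstruct(y, s, masks)
```

Because each pixel contributes only two measurements for several unknown frames, this unregularized fit reproduces both snapshots but does not by itself recover the true frames. A practical reconstruction would add a prior to the joint cost.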