Vibrational sum frequency generation (VSFG) spectroscopy is a powerful technique for probing molecular structures in non-centrosymmetric media. Recently developed heterodyne (HD) detection can further reveal the spectral phase and molecular orientations. Adding imaging capability to HD VSFG brings spatial visualization to this nonlinear optical technique. However, it has been a challenge to build an HD VSFG microscope that is both easy to align and has good spectral phase stability, two criteria necessary for the broad application of this technique in various areas of science. Here, we report a fully collinear HD VSFG microscope that meets both the phase-stability and optical-alignment requirements and can spatially resolve molecular interfaces and domains with chemical and structural sensitivity. Its phase stability is more than nine times better than that of a Michelson interferometric HD VSFG microscope. Using this HD VSFG microscope, we study the structures of self-assembled molecular films. Because of the superior phase sensitivity, we successfully identify two molecular domains with different molecular orientations, information that, as we show, cannot be extracted from an ensemble-averaged VSFG spectrum or a homodyne-detected VSFG image.