Near-infrared-visible (NIR-VIS) heterogeneous face recognition matches NIR to corresponding VIS face images. However, due to the sensing gap, NIR images often lose some identity information so that the NIR-VIS recognition issue is more difficult than conventional VIS face recognition. Recently, NIR-VIS heterogeneous face recognition has attracted considerable attention in the computer vision community because of its convenience and adaptability in practical applications. Various deep learning-based methods have been proposed and substantially increased the recognition performance, but the lack of NIR-VIS training samples leads to the difficulty of the model training process. In this paper, we propose a new Large-Scale Multi-Pose High-Quality NIR-VIS database 'LAMP-HQ' containing 56,788 NIR and 16,828 VIS images of 573 subjects with large diversities in pose, illumination, attribute, scene and accessory. We furnish a benchmark along with the protocol for NIR-VIS face recognition via generation on LAMP-HQ, including Pixel2-Pixel, CycleGAN, ADFL, PCFH, and PACH. Furthermore, we propose a novel exemplar-based variational spectral attention network to produce high-fidelity VIS images from NIR data. A spectral conditional attention module is introduced to reduce the domain gap between NIR and VIS data and then improve the performance of NIR-VIS heterogeneous face recognition on various databases including the LAMP-HQ.