Hyperspectral (HS) pansharpening aims at fusing a low-resolution HS (LRHS) image with a high-resolution panchromatic (PAN) image to obtain an HS image with both high spectral and high spatial resolution. However, existing HS pansharpening algorithms are mainly based on multispectral (MS) pansharpening approaches, which cannot fully restore the rich spectral information and high-frequency spatial details across the continuous spectral bands and much broader spectral range of HS images, leading to spectral distortion and spatial blur. In this paper, we develop a new HS pansharpening network architecture (called Hyper-DSNet) to preserve latent spatial details and spectral fidelity via a deep-shallow fusion structure with a multi-detail extractor and spectral attention. Specifically, the proposed architecture consists of three main parts. First, to address spatial ambiguity and exploit latent information, five types of high-pass filter templates are used to extract the spatial details of the PAN image, forming a so-called multi-detail extractor (MDE). Then, after a multi-scale convolution module, a deep-shallow fusion (DSF) structure, which reduces parameters by decreasing the number of output channels as the network goes deeper, is applied. Finally, a spectral attention (SA) module is applied to preserve the rich spectral information of HS images. Visual and quantitative experiments on three commonly used simulated datasets and one full-resolution dataset demonstrate the effectiveness and robustness of the proposed Hyper-DSNet against recent state-of-the-art hyperspectral pansharpening techniques. Ablation studies and discussions further verify our contributions, e.g., better spectral preservation and spatial detail recovery.
The source code is available at https://github.com/liangjiandeng/Hyper-DSNet, and the related datasets can be found at https://github.com/liangjiandeng/HyperPanCollection.
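
To make the idea behind the multi-detail extractor concrete, the following is a minimal sketch of extracting spatial details from a PAN image with several high-pass filter templates. It is not taken from the released code: the three kernels (a Laplacian and two Sobel templates), the dictionary `HIGH_PASS_KERNELS`, and the helpers `filter2d_valid` and `extract_details` are assumptions chosen for illustration, and the five templates actually used in Hyper-DSNet are not listed in this abstract.

```python
# Illustrative sketch only. The kernels below are common zero-sum high-pass
# templates picked for demonstration; they are assumptions, not the five
# templates used by Hyper-DSNet.
HIGH_PASS_KERNELS = {
    "laplacian": [[0, 1, 0], [1, -4, 1], [0, 1, 0]],
    "sobel_x": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "sobel_y": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
}


def filter2d_valid(image, kernel):
    """Cross-correlate a 2-D list-of-lists image with a kernel (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def extract_details(pan):
    """Return one detail map per high-pass template, keyed by template name."""
    return {name: filter2d_valid(pan, k) for name, k in HIGH_PASS_KERNELS.items()}


# A flat image carries no spatial detail; a step edge does.
flat = [[5] * 5 for _ in range(5)]
edge = [[0, 0, 10, 10, 10] for _ in range(5)]  # vertical step edge

flat_details = extract_details(flat)
edge_details = extract_details(edge)
```

Because every template sums to zero, the flat image yields all-zero detail maps, while the vertical edge responds in the Laplacian and horizontal-gradient maps but not in the vertical-gradient map. In a network, such maps would typically be stacked along the channel dimension before further processing; how Hyper-DSNet combines them is described in the paper and the released code rather than here.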