Objective. Steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) have received great interest owing to their high information transfer rate (ITR) and the large number of available targets. However, the performance of frequency recognition methods depends heavily on the amount of calibration data available for intra-subject classification. Some studies have adopted deep learning (DL) algorithms for inter-subject classification, which could reduce the calculation procedure, but their performance still leaves considerable room for improvement compared with intra-subject classification. Approach. To address these issues, we proposed an efficient SSVEP DL NETwork (termed SSVEPNET) based on 1D convolution and long short-term memory (LSTM) modules. To enhance the performance of SSVEPNET, we adopted spectral normalization and label smoothing when implementing the network architecture. We evaluated SSVEPNET and compared it with other methods for intra- and inter-subject classification under different conditions, i.e. two datasets, two time-window lengths (1 s and 0.5 s), and three sizes of training data. Main results. Under all experimental settings, the proposed SSVEPNET achieved the highest average accuracy for both intra- and inter-subject classification on the two SSVEP datasets when compared with traditional and DL baseline methods. Significance. The extensive experimental results demonstrate that the proposed DL model holds promise for enhancing frequency recognition performance in SSVEP-based BCIs. In addition, the mixed network structure combining CNN and LSTM, together with spectral normalization and label smoothing, could be useful optimization strategies for designing efficient models for EEG data.
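
For readers unfamiliar with label smoothing, the following is a minimal sketch of the standard uniform formulation, not the SSVEPNET implementation itself. The function names, the smoothing factor `epsilon = 0.1`, and the four-class example are assumptions chosen for illustration; the abstract does not specify the variant or the value used.

```python
import math

def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Uniform label smoothing: move a fraction epsilon of the probability
    mass from the true class to a uniform distribution over all classes.
    epsilon is a hypothetical value, not one taken from the paper."""
    off_value = epsilon / num_classes
    return [
        (1.0 - epsilon + off_value) if k == target_index else off_value
        for k in range(num_classes)
    ]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_dist):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_dist, probs) if t > 0)

# Example: 4 stimulus frequencies (classes), true class is index 1.
hard = smooth_labels(1, 4, epsilon=0.0)   # ordinary one-hot target
soft = smooth_labels(1, 4, epsilon=0.1)   # smoothed target
logits = [0.0, 10.0, 0.0, 0.0]            # a very confident, correct prediction
loss_hard = cross_entropy(logits, hard)
loss_soft = cross_entropy(logits, soft)
```

With smoothed targets, an over-confident prediction is penalized more than with one-hot targets, which is the regularizing effect that label smoothing is generally intended to provide.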