Owing to the constraints of time and space complexity, network intrusion detection systems (NIDSs) based on support vector machines (SVMs) face the “curse of dimensionality” in a large-scale, high-dimensional feature space. This study proposes a joint training model that combines a stacked autoencoder (SAE) with an SVM and the kernel approximation technique. The training model uses the SAE to perform feature dimension reduction, uses random Fourier features to perform kernel approximation, and then random Fourier mapping is explicitly applied to the sub-sample to generate the random feature space, making it possible to apply a linear SVM to uniformly approximate to the Gaussian kernel SVM. Finally, the SAE performs joint training with the efficient linear SVM. We studied the effects of an SAE structure and a random Fourier feature on classification performance, and compared that performance with that of other training models, including some without kernel approximation. At the same time, we compare the accuracy of the proposed model with that of other models, which include basic machine learning models and the state-of-the-art models in other literatures. The experimental results demonstrate that the proposed model outperforms the previously proposed methods in terms of classification performance and also reduces the training time. Our model is feasible and works efficiently on large-scale datasets.