<p>Automatic classification of electroencephalography (EEG) signals is important for the diagnosis and treatment of epilepsy. Currently dominant single-modal feature extraction methods cannot capture information from different modalities, which limits the classification performance of existing methods, particularly on multi-class problems. We propose a multi-modal feature fusion (MMFF) method for epileptic EEG signals. First, time domain features were extracted by kernel principal component analysis, frequency domain features were extracted by the short-time Fourier transform, and nonlinear dynamic features were extracted by calculating sample entropy. On this basis, the features of the three modalities were interactively learned through a multi-head self-attention mechanism, with the attention weights trained simultaneously. The fused features were obtained by combining the value vectors of the feature representations; retaining the time, frequency, and nonlinear dynamics information in this way screens out more representative epileptic features and improves the accuracy of feature extraction. Finally, the feature fusion method was applied to epileptic EEG signal classification. The experimental results demonstrate that the proposed method achieves a classification accuracy of 92.76 ± 1.64% on the five-class classification task for epileptic EEG signals. The multi-head self-attention mechanism promotes the fusion of multi-modal features and offers an efficient and novel approach for diagnosing and treating epilepsy.</p>
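<p>The abstract gives no implementation details, so the following Python sketch only illustrates the described pipeline under stated assumptions: kernel PCA for time-domain features, the short-time Fourier transform for frequency-domain features, sample entropy for the nonlinear dynamic feature, and multi-head self-attention over the three modality vectors. The window length, feature dimension, sampling rate, and head count are illustrative choices, not the authors' settings, and the attention module is shown untrained (in practice its weights would be learned jointly with a downstream classifier).</p>
<pre><code>
# Minimal sketch of the MMFF pipeline described in the abstract (not the authors' code).
import numpy as np
import torch
from scipy.signal import stft
from sklearn.decomposition import KernelPCA


def sample_entropy(x, emb=2, r=None):
    """Plain sample entropy of a 1-D signal; tolerance r defaults to 0.2 * std."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)
    n = len(x)

    def count_matches(m):
        # Pairwise Chebyshev distances between all length-m templates.
        templates = np.array([x[i:i + m] for i in range(n - m)])
        dists = np.max(np.abs(templates[:, None] - templates[None, :]), axis=2)
        # Count matching pairs, excluding self-matches on the diagonal.
        return (np.sum(dists <= r) - len(templates)) / 2

    b, a = count_matches(emb), count_matches(emb + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf


def extract_modalities(segments, fs=256.0, dim=16):
    """segments: (n_segments, n_samples) EEG windows -> three (n_segments, dim) blocks."""
    # Time domain: kernel PCA applied to the raw windows.
    time_feat = KernelPCA(n_components=dim, kernel="rbf").fit_transform(segments)
    # Frequency domain: mean STFT magnitude per frequency bin, truncated to `dim` bins.
    _, _, Z = stft(segments, fs=fs, nperseg=64)
    freq_feat = np.abs(Z).mean(axis=-1)[:, :dim]
    # Nonlinear dynamics: sample entropy, tiled so all modalities share one width.
    ent = np.array([sample_entropy(s) for s in segments])
    nl_feat = np.tile(ent[:, None], (1, dim))
    return time_feat, freq_feat, nl_feat


def fuse(time_feat, freq_feat, nl_feat, n_heads=4):
    """Treat the three modality vectors as a 3-token sequence, let multi-head
    self-attention learn cross-modal weights, and concatenate the attended
    value vectors into the fused representation."""
    tokens = torch.tensor(np.stack([time_feat, freq_feat, nl_feat], axis=1),
                          dtype=torch.float32)               # (batch, 3, dim)
    attn = torch.nn.MultiheadAttention(embed_dim=tokens.shape[-1],
                                       num_heads=n_heads, batch_first=True)
    fused, weights = attn(tokens, tokens, tokens)            # self-attention over modalities
    return fused.flatten(1), weights                         # (batch, 3*dim), (batch, 3, 3)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eeg = rng.standard_normal((32, 1024))   # 32 synthetic EEG windows for a shape check
    fused, w = fuse(*extract_modalities(eeg))
    print(fused.shape, w.shape)             # torch.Size([32, 48]) torch.Size([32, 3, 3])
</code></pre>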