Background. The most common and successful technique for denoising nonstationary signals, such as the electroencephalogram (EEG) and electrocardiogram (ECG), is the wavelet transform (WT). The success of WT depends on the optimal configuration of its control parameters, which are often set experimentally. Fortunately, the optimality of a combination of these parameters can be measured in advance using the mean squared error (MSE) function. Method. In this paper, five powerful metaheuristic algorithms are proposed to find the optimal WT parameters for EEG signal denoising: harmony search (HS), β-hill climbing (β-hc), particle swarm optimization (PSO), genetic algorithm (GA), and flower pollination algorithm (FPA). It is worth mentioning that this is the initial investigation of using optimization methods for WT parameter configuration. The paper then examines which algorithm obtains the minimum MSE and the best WT parameter configuration. Result. The performance of the proposed algorithms is tested using two standard EEG datasets, namely, Kiern's EEG dataset and the EEG Motor Movement/Imagery dataset. The results are evaluated using five common criteria: signal-to-noise ratio (SNR), SNR improvement, MSE, root mean square error (RMSE), and percentage root mean square difference (PRD). For almost all evaluation criteria, FPA achieves the best parameter configuration for WT and enables this technique to denoise the EEG signals efficiently on almost all of the datasets used. To further validate the FPA results, a comparative study between the FPA results and the results of two previous studies is conducted, and the findings favor FPA. Conclusion. The results show that the proposed methods for EEG signal denoising can produce better results than manual configurations based on an ad hoc strategy.
Therefore, using metaheuristic approaches to optimize the WT parameters for EEG signals improves the denoising performance of the WT method.
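The five evaluation criteria named above can be illustrated with a short sketch. The abstract does not state the exact formulas used in the paper, so the definitions below are the commonly used forms and should be read as assumptions: errors are measured between a clean reference signal and its estimate, SNR is expressed in decibels, and SNR improvement is taken as output SNR minus input SNR. All function names are hypothetical.

```python
import math


def _squared_error(reference, estimate):
    """Sum of squared differences between two equal-length sequences."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate))


def mse(clean, denoised):
    """Mean squared error between the clean and denoised signals."""
    return _squared_error(clean, denoised) / len(clean)


def rmse(clean, denoised):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(clean, denoised))


def prd(clean, denoised):
    """Percentage root mean square difference (assumed unnormalized form)."""
    energy = sum(c ** 2 for c in clean)
    return 100.0 * math.sqrt(_squared_error(clean, denoised) / energy)


def snr_db(clean, estimate):
    """SNR in dB: clean-signal energy over residual-error energy."""
    error = _squared_error(clean, estimate)
    if error == 0:
        return math.inf  # a perfect estimate has unbounded SNR
    energy = sum(c ** 2 for c in clean)
    return 10.0 * math.log10(energy / error)


def snr_improvement_db(clean, noisy, denoised):
    """Assumed definition: SNR after denoising minus SNR before denoising."""
    return snr_db(clean, denoised) - snr_db(clean, noisy)
```

Under these assumptions, a metaheuristic search over WT parameters would call `mse` as its objective function for each candidate configuration, while the remaining functions would be used only to report the quality of the final denoised signal.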