This paper presents a novel self-adaptive approach to speech enhancement under highly nonstationary noise. A two-stage deep neuroevolutionary technique is proposed. The first stage consists of a deep neural network (DNN) enhancement method; two DNNs were tested at this stage, namely a deep complex convolution recurrent network (DCCRN) and a residual long short-term memory (ResLSTM) network. The ResLSTM network, used as an a priori signal-to-noise ratio (SNR) estimator, was combined with a minimum mean-square error (MMSE) method to perform a preliminary enhancement. The second stage implements a self-adaptive multiband spectral subtraction method whose parameters are tuned by a genetic algorithm. The proposed two-stage technique is evaluated with objective measures of speech quality and intelligibility. Experiments are carried out on the NOIZEUS noisy speech corpus under real-world stationary, colored, and nonstationary noise conditions at multiple SNR levels. They demonstrate the advantage of a cooperative approach that combines evolutionary and deep learning techniques to achieve robust speech enhancement in adverse conditions. In particular, the proposed two-stage technique outperforms a baseline built on a state-of-the-art deep learning approach by an average of 13% and 6% across six noise conditions at input SNRs of −5 dB and 0 dB, respectively.
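To make the second stage concrete, the sketch below illustrates one possible form of genetic-algorithm tuning for multiband spectral subtraction: per-band over-subtraction factors are evolved to minimize a spectral distance. This is a minimal illustration, not the authors' implementation; the band edges, over-subtraction range, GA operators (tournament selection, uniform crossover, Gaussian mutation), and the reference-based fitness are all assumptions standing in for whatever objective the paper actually optimizes.

```python
# Minimal sketch (assumptions, not the paper's code) of stage two: multiband
# spectral subtraction whose per-band over-subtraction factors are tuned by
# a simple genetic algorithm operating on a single magnitude-spectrum frame.
import numpy as np

rng = np.random.default_rng(0)

def multiband_spectral_subtraction(noisy_mag, noise_mag, alphas, band_edges, floor=0.02):
    """Subtract the noise magnitude per frequency band, scaled by alpha_i."""
    clean_mag = np.empty_like(noisy_mag)
    for (lo, hi), alpha in zip(band_edges, alphas):
        band = slice(lo, hi)
        residual = noisy_mag[band] - alpha * noise_mag[band]
        # A spectral floor limits musical-noise artifacts from over-subtraction.
        clean_mag[band] = np.maximum(residual, floor * noisy_mag[band])
    return clean_mag

def fitness(alphas, noisy_mag, noise_mag, ref_mag, band_edges):
    """Negative spectral distance to a reference magnitude (a proxy objective)."""
    est = multiband_spectral_subtraction(noisy_mag, noise_mag, alphas, band_edges)
    return -np.mean((est - ref_mag) ** 2)

def tune_alphas_ga(noisy_mag, noise_mag, ref_mag, band_edges,
                   pop_size=30, generations=40, mutation_scale=0.2):
    """Toy GA: tournament selection, uniform crossover, Gaussian mutation."""
    n_bands = len(band_edges)
    pop = rng.uniform(1.0, 5.0, size=(pop_size, n_bands))  # over-subtraction range
    for _ in range(generations):
        scores = np.array([fitness(ind, noisy_mag, noise_mag, ref_mag, band_edges)
                           for ind in pop])
        children = []
        for _ in range(pop_size):
            i, j = rng.integers(pop_size, size=2)          # tournament, parent 1
            p1 = pop[i] if scores[i] >= scores[j] else pop[j]
            i, j = rng.integers(pop_size, size=2)          # tournament, parent 2
            p2 = pop[i] if scores[i] >= scores[j] else pop[j]
            mask = rng.random(n_bands) < 0.5               # uniform crossover
            child = np.where(mask, p1, p2)
            child += rng.normal(0.0, mutation_scale, n_bands)  # Gaussian mutation
            children.append(np.clip(child, 0.5, 6.0))
        pop = np.array(children)
    scores = np.array([fitness(ind, noisy_mag, noise_mag, ref_mag, band_edges)
                       for ind in pop])
    return pop[np.argmax(scores)]

if __name__ == "__main__":
    n_bins = 257                                   # e.g., a 512-point FFT frame
    band_edges = [(0, 64), (64, 128), (128, 192), (192, 257)]
    clean = np.abs(rng.normal(size=n_bins))        # stand-in spectra for the demo
    noise = 0.5 * np.abs(rng.normal(size=n_bins))
    noisy = clean + noise
    best = tune_alphas_ga(noisy, noise, clean, band_edges)
    print("tuned per-band over-subtraction factors:", np.round(best, 2))
```

The clipping after mutation keeps the factors in a plausible over-subtraction range; in a self-adaptive system, candidates would be scored with an objective quality or intelligibility measure rather than against a clean reference spectrum, which is available here only because the demo fabricates one.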