Speech emotion recognition plays a crucial role in analyzing psychological disorders, behavioral decision‐making, and human‐machine interaction applications. However, most current speech emotion recognition methods rely heavily on data‐driven approaches, and the scarcity of emotional speech datasets limits progress in the research and development of emotion analysis and recognition. To address this issue, this study introduces a new English speech dataset specifically designed for emotion analysis and recognition. The dataset consists of 5503 voice recordings from over 60 English speakers in different emotional states. To support emotion analysis and recognition, the fast Fourier transform (FFT), short‐time Fourier transform (STFT), mel‐frequency cepstral coefficients (MFCCs), and continuous wavelet transform (CWT) are employed to extract features from the speech data. These algorithms yield spectrum images of the speech signals, which form four datasets of different speech feature images. To evaluate the dataset, 16 classification models and 19 detection algorithms are selected. The experimental results show that most of the classification and detection models achieve very high recognition accuracy on this dataset, confirming its effectiveness and utility. The dataset is therefore a valuable resource for advancing research and development in the field of emotion recognition.
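As an illustration of the feature‐extraction step summarized above, the following sketch shows how the four representations (FFT, STFT, MFCC, CWT) could be computed and saved as images with common Python libraries (librosa, PyWavelets, matplotlib). The file names, sample rate, FFT size, and wavelet choice are illustrative assumptions, not the exact settings used in the study.

```python
# Minimal sketch of the four feature-extraction steps (FFT, STFT, MFCC, CWT).
# Assumes librosa, PyWavelets (pywt), and matplotlib are installed; paths and
# parameter values are hypothetical, not those of the published dataset.
import numpy as np
import librosa
import pywt
import matplotlib.pyplot as plt

def extract_feature_images(wav_path, out_prefix, sr=16000):
    y, sr = librosa.load(wav_path, sr=sr)

    # FFT: magnitude spectrum of the whole utterance
    fft_mag = np.abs(np.fft.rfft(y))

    # STFT: time-frequency spectrogram (log magnitude)
    stft = librosa.stft(y, n_fft=1024, hop_length=256)
    stft_db = librosa.amplitude_to_db(np.abs(stft), ref=np.max)

    # MFCCs: 40 mel-frequency cepstral coefficients per frame
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)

    # CWT: scalogram using a Morlet wavelet
    scales = np.arange(1, 128)
    cwt_coeffs, _ = pywt.cwt(y, scales, "morl")

    # Save each representation as an image for downstream classifiers/detectors
    plt.plot(fft_mag)
    plt.savefig(f"{out_prefix}_fft.png")
    plt.close()
    for name, img in [("stft", stft_db), ("mfcc", mfcc),
                      ("cwt", np.abs(cwt_coeffs))]:
        plt.imsave(f"{out_prefix}_{name}.png", img, origin="lower")

# Hypothetical usage on one recording of the dataset
extract_feature_images("speaker01_happy_001.wav", "speaker01_happy_001")
```

In such a pipeline, each recording yields one image per transform, so the corpus naturally splits into four image datasets that classification and detection models can be trained on separately.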