Gait phase recognition is a prerequisite for the compliance control of wearable lower-limb exoskeleton robots, yet accurately estimating the gait phase remains a key challenge in exoskeleton control. To address this challenge, this study proposes a hybrid model that combines a Convolutional Neural Network (CNN) with a Support Vector Machine (SVM) optimized by Harris Hawks Optimization (HHO). First, the collected sensor signals are normalized to reduce inter-subject differences in the data. Then, a simplified CNN automatically extracts more discriminative features from the dataset. These features are classified by an SVM, which replaces the softmax layer of the CNN. In addition, an improved HHO algorithm is used to optimize the SVM classification process. The model accurately identifies the heel strike (HS), flat foot (FF), heel off (HO), and swing (SW) phases of the gait cycle. Experimental results show that the CNN-HHO-SVM algorithm achieves an average phase recognition accuracy of 96.03% across seven subjects in a self-built dataset, outperforming the traditional method that relies on manually extracted time-frequency features. The F1-score and macro-recall of the CNN-HHO-SVM algorithm are also higher than those of the other algorithms compared, which supports the effectiveness of the proposed approach.
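
As an illustration of the evaluation metrics mentioned above, the following minimal Python sketch computes macro-recall and a macro-averaged F1-score over the four gait phases. It is not the authors' evaluation code: the function names, the label strings, and the toy label sequences are hypothetical, and macro averaging of the F1-score is an assumption, since the abstract does not state which averaging scheme was used.

```python
from collections import Counter

# The four gait phases named in the abstract; the string labels are an assumption.
PHASES = ("HS", "FF", "HO", "SW")


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # the true class was missed
            fp[p] += 1  # the predicted class was wrongly assigned
    return tp, fp, fn


def macro_recall(y_true, y_pred, labels=PHASES):
    """Unweighted mean of per-class recall, TP / (TP + FN)."""
    tp, _, fn = per_class_counts(y_true, y_pred)
    recalls = []
    for c in labels:
        denom = tp[c] + fn[c]
        recalls.append(tp[c] / denom if denom else 0.0)
    return sum(recalls) / len(labels)


def macro_f1(y_true, y_pred, labels=PHASES):
    """Unweighted mean of per-class F1, 2TP / (2TP + FP + FN)."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Hypothetical toy labels, not data from the study.
y_true = ["HS", "HS", "FF", "FF", "HO", "HO", "SW", "SW"]
y_pred = ["HS", "FF", "FF", "FF", "HO", "SW", "SW", "SW"]

print("macro-recall:", macro_recall(y_true, y_pred))
print("macro-F1:", macro_f1(y_true, y_pred))
```

Macro averaging weights each phase equally, so a short phase such as heel strike counts as much as the longer swing phase, which is why it is often reported alongside overall accuracy for imbalanced gait data.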