Human gait phase detection is a key technology for robotic exoskeleton control and exercise rehabilitation therapy. Inertial Measurement Units (IMUs) containing accelerometers and gyroscopes offer a convenient and inexpensive way to collect gait data and are widely used to analyze gait dynamics in daily personal applications. However, current deep-learning methods that extract spatial features and temporal features in isolation can easily overlook correlations that exist in the high-dimensional space, which limits the recognition performance of any single model. In this study, an effective hybrid deep-learning framework based on Gaussian probability fusion of multiple spatiotemporal networks (GFM-Net) is proposed to detect different gait phases from multisource IMU signals. The framework first employs a gait information acquisition system to collect data from IMUs fixed on the lower limbs. After data preprocessing, it constructs a spatial feature extractor composed of AutoEncoder and CNN modules and a multistream temporal feature extractor with three parallel branches built on RNN, LSTM, and GRU modules. Finally, a novel Gaussian probability fusion module optimized by the Expectation-Maximization (EM) algorithm is developed to integrate the feature maps output by the three submodels and complete gait phase recognition. The proposed framework nests an inner loop containing the EM algorithm within an outer loop that performs gradient backpropagation over the entire network. Experiments show that this method achieves superior performance in gait phase classification, with accuracy exceeding 96.7%.
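
The sketch below illustrates the overall layout described above (spatial AutoEncoder + CNN extractor feeding three parallel RNN/LSTM/GRU branches whose class probabilities are fused). It is a minimal PyTorch sketch under assumed dimensions (6-channel IMU windows of 128 samples, 4 gait phases) and replaces the paper's EM-optimized Gaussian fusion with simple learnable mixing weights; it is not the authors' implementation.

```python
# Minimal sketch of the GFM-Net layout, assuming PyTorch and hypothetical
# dimensions (6-channel IMU windows, 128 samples, 4 gait phases). The fusion
# here is a simplified weighted average of branch probabilities standing in
# for the paper's EM-optimized Gaussian probability fusion module.
import torch
import torch.nn as nn


class SpatialExtractor(nn.Module):
    """AutoEncoder-style bottleneck followed by a 1-D CNN (assumed sizes)."""
    def __init__(self, in_ch=6, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_ch, hidden), nn.ReLU(),
                                     nn.Linear(hidden, in_ch))
        self.cnn = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU())

    def forward(self, x):                 # x: (batch, time, channels)
        x = self.encoder(x)               # per-sample reconstruction features
        return self.cnn(x.transpose(1, 2)).transpose(1, 2)  # (batch, time, 64)


class TemporalBranch(nn.Module):
    """One of three parallel streams: vanilla RNN, LSTM, or GRU."""
    def __init__(self, rnn_type, in_dim=64, hidden=64, n_classes=4):
        super().__init__()
        rnn_cls = {"rnn": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}[rnn_type]
        self.rnn = rnn_cls(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, feats):
        out, _ = self.rnn(feats)
        return self.head(out[:, -1])      # logits from the last time step


class GFMNetSketch(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.spatial = SpatialExtractor()
        self.branches = nn.ModuleList(
            [TemporalBranch(t, n_classes=n_classes) for t in ("rnn", "lstm", "gru")])
        # Learnable mixing weights; the paper instead estimates Gaussian
        # fusion parameters with an inner EM loop.
        self.mix = nn.Parameter(torch.ones(3) / 3)

    def forward(self, x):
        feats = self.spatial(x)
        probs = torch.stack([b(feats).softmax(dim=-1) for b in self.branches])
        weights = self.mix.softmax(dim=0).view(3, 1, 1)
        return (weights * probs).sum(dim=0)   # fused class probabilities


if __name__ == "__main__":
    window = torch.randn(8, 128, 6)           # batch of 8 IMU windows
    print(GFMNetSketch()(window).shape)        # torch.Size([8, 4])
```

In this sketch the mixing weights would be trained jointly with the rest of the network by backpropagation, which mirrors the outer optimization loop of the framework; the inner EM loop that the paper uses to fit the Gaussian fusion parameters is omitted for brevity.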