The essential human gait parameters are briefly reviewed, followed by a detailed review of the state of the art in deep learning for human gait analysis. The modalities for capturing gait data, together with the publicly available datasets, are grouped according to the sensing technology: video sequences, wearable sensors, and floor sensors. The established artificial neural network architectures for deep learning are reviewed for each group, and their performance is compared, with particular emphasis on the spatiotemporal character of gait data and the motivation for multi-sensor, multi-modality fusion. It is shown that, by most of the essential metrics, deep learning convolutional neural networks typically outperform shallow learning models. In light of the discussed character of gait data, this is attributed to the ability of deep learning to extract gait features automatically, whereas shallow learning relies on handcrafted gait features.