Slope failures occur when parts of a slope collapse abruptly under the influence of gravity, often triggered by a rainfall event or earthquake. The resulting slope failures often cause problems in mountainous or hilly regions, and the detection of slope failure is therefore an important topic for research. Most of the methods currently used for mapping and modelling slope failures rely on classification algorithms or feature extraction, but the spatial complexity of slope failures, the uncertainties inherent in expert knowledge, and problems in transferability, all combine to inhibit slope failure detection. In an attempt to overcome some of these problems we have analyzed the potential of deep learning convolutional neural networks (CNNs) for slope failure detection, in an area along a road section in the northern Himalayas, India. We used optical data from unmanned aerial vehicles (UAVs) over two separate study areas. Different CNN designs were used to produce eight different slope failure distribution maps, which were then compared with manually extracted slope failure polygons using different accuracy assessment metrics such as the precision, F-score, and mean intersection-over-union (mIOU). A slope failure inventory data set was produced for each of the study areas using a frequency-area distribution (FAD). The CNN approach that was found to perform best (precision accuracy assessment of almost 90% precision, F-score 85%, mIOU 74%) was one that used a window size of 64 × 64 pixels for the sample patches, and included slope data as an additional input layer. The additional information from the slope data helped to discriminate between slope failure areas and roads, which had similar spectral characteristics in the optical imagery. We concluded that the effectiveness of CNNs for slope failure detection was strongly dependent on their design (i.e., the window size selected for the sample patch, the data used, and the training strategies), but that CNNs are currently only designed by trial and error. While CNNs can be powerful tools, such trial and error strategies make it difficult to explain why a particular pooling or layer numbering works better than any other.