Depression is a prevalent comorbidity in patients with severe physical disorders, such as cancer, stroke, and coronary disease. Although it can significantly affect the course of the primary disease, signs of depression are often underestimated and overlooked. The aim of this paper was to review algorithms for the automatic, uniform, and multimodal classification of signs of depression from human conversations and to evaluate their accuracy. The review followed the PRISMA guidelines for scoping reviews. The search yielded 1095 papers, of which 20 papers (8.26%) included more than two modalities, and 3 of those papers provided code. Within the scope of this review, support vector machine (SVM), random forest (RF), and long short-term memory (LSTM) network models (with gated and non-gated recurrent units), as well as different combinations of features, were identified as the most widely researched techniques. We tested the models on the DAIC-WOZ dataset (the original training dataset) and on the SymptomMedia dataset to further assess their reliability and their dependency on the nature of the training data. The best performance was obtained by the LSTM with gated recurrent units (F1-score of 0.64 on the DAIC-WOZ dataset). However, with a drop to an F1-score of 0.56 on the SymptomMedia dataset, this method also appears to be the most data-dependent.