Threshold analysis has recently been proposed for use in combination with the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach to assess how sensitive treatment recommendations derived from Bayesian network meta-analysis (NMA) are to plausible bias. Here, the aim was to apply the combination of threshold analysis and GRADE to judge quantitative and qualitative information on risk of bias in antidepressant treatment recommendations. The analysis was based on the data set provided by Cipriani et al. (The Lancet 2018) comparing 21 antidepressants in adult major depressive disorder (MDD). Primary outcomes were efficacy (response rate) and acceptability (dropout rate), adjusted for the covariate depression severity. The combined approach suggested that sensitivity to plausible bias was largest for the antidepressant recommendations ranked highest by Cipriani et al., that is, amitriptyline, duloxetine, paroxetine, and venlafaxine in terms of efficacy and agomelatine, escitalopram, paroxetine, and venlafaxine in terms of acceptability. The covariate ranges within which recommendations were most sensitive to plausible bias were very severe depression in terms of efficacy (smallest threshold, i.e., the largest sensitivity, around 39 on the Hamilton Depression Rating Scale [HDRS]) and moderate depression in terms of acceptability (smallest thresholds around 16 and 35 HDRS). This indicates that treatment recommendations within these ranges are likely to change if plausible bias adjustments are made. The present findings may support decision makers in judging the sensitivity to plausible bias of current antidepressant treatment recommendations and in guiding treatment decisions in MDD accurately according to depression severity.