We set out to test whether positive non-verbal behaviours of a virtual coach can enhance people's engagement in automated virtual reality (VR) therapy. A total of 120 individuals scoring highly for fear of heights participated. In a two-by-two factorial, between-groups, randomised design, participants met a virtual coach that varied in warmth of facial expression (with/without) and affirmative nods (with/without). The virtual coach provided a consultation about treating fear of heights. Participants rated the therapeutic alliance, treatment credibility, and treatment expectancy. Both warm facial expressions (group difference = 7.44 [3.25, 11.62], p = 0.001, $\eta_p^2$ = 0.10) and affirmative nods (group difference = 4.36 [0.21, 8.58], p = 0.040, $\eta_p^2$ = 0.04) by the virtual coach independently increased therapeutic alliance. Affirmative nods increased treatment credibility (group difference = 1.76 [0.34, 3.11], p = 0.015, $\eta_p^2$ = 0.05) and expectancy (group difference = 2.28 [0.45, 4.12], p = 0.015, $\eta_p^2$ = 0.05), but warm facial expressions did not increase treatment credibility (group difference = 0.64 [−0.75, 2.02], p = 0.363, $\eta_p^2$ = 0.01) or expectancy (group difference = 0.36 [−1.48, 2.20], p = 0.700, $\eta_p^2$ = 0.001). There were no significant interactions between head nods and facial expressions for therapeutic alliance (p = 0.403, $\eta_p^2$ = 0.01), credibility (p = 0.072, $\eta_p^2$ = 0.03), or expectancy (p = 0.275, $\eta_p^2$ = 0.01). Our results demonstrate that in the development of automated VR therapies there is likely to be therapeutic value in detailed consideration of the animations of virtual coaches.