Mobile video calling technologies have become a critical link to connect distributed families. However, these technologies have been principally designed for video calling between two parties, whereas family video calls involve young children often comprise three parties, namely a co-present adult (a parent or grandparent) helping with the interaction between the child and another remote adult. We examine how manipulation of phone cameras and management of co-present children is used to stage parentchild interactions. We present results from a videoethnographic study based on 40 video recordings of video calls between 'left-behind' children and their migrant parents in China. Our analysis reveals a key practice of 'facilitation work', performed by grandparents, as a crucial feature of three-party calls. Facilitation work offers a new concept for HCI's broader conceptualisation of mobile video calling, suggesting revisions that design might take into consideration for triadic interactions in general.