Whereas social visual attention has been examined in computer-mediated (e.g., shared screen) or video-mediated (e.g., FaceTime) interaction, it has yet to be studied in mixed-media interfaces that combine video of the conversant with other UI elements. We analyzed the eye gaze of 37 dyads (74 participants) who negotiated the price of a new car (as buyer and seller) using mixed-media video conferencing under competitive or cooperative negotiation instructions (the experimental manipulation). We used multidimensional recurrence quantification analysis to extract spatio-temporal patterns corresponding to mutual gaze (individuals look at each other), joint attention (individuals focus on the same elements of the interface), and gaze aversion (an individual looks at their partner, who is looking elsewhere). Our results indicated that joint attention predicted the sum of points attained by the buyer and seller (i.e., the joint score). In contrast, gaze aversion was associated with a shorter time to complete the negotiation, but also with a lower joint score. Unexpectedly, mutual gaze was highly infrequent and unrelated to the negotiation outcomes, and none of the gaze patterns predicted subjective perceptions of the negotiation. There were also no effects of gender composition or negotiation condition on the gaze patterns or negotiation outcomes. Our results suggest that social visual attention may operate differently in mixed-media collaborative interfaces than in face-to-face interaction. As mixed-media collaborative interfaces gain prominence, our work can be leveraged to inform the design of gaze-sensitive user interfaces that support remote negotiations, among other tasks.