Stance-taking, the public act of positioning oneself towards objects, people, or states of affairs, has been studied in many fields of research. Recently, its multimodal realization in interaction has received increasing attention. This contribution takes stock of research on multimodal stance-taking so far and presents possible avenues for future research. We systematically gathered and appraised 76 articles that investigate the involvement of bodily-visual resources in stance-taking in interaction. The critical appraisal focused on two dimensions of the stance act: the form-function relations constituting it, and its dynamic organization in interaction. Regarding form-function relations, we found systematic involvement of specific bodily-visual resources in different stance acts, as well as analyses of how stances can be intensified or downplayed multimodally. As for its dynamic organization, the review discusses how stance-taking is organized temporally throughout an interaction, with all participants involved carefully negotiating and adapting their stances to one another. Finally, attention is paid to the broader context of stance-taking, including its role in different social and societal contexts. Based on the review, we identify several gaps in the literature and avenues for future research. We argue that much potential for broadening the scope of research lies in increasing the methodological diversity in approaching multimodal stance-taking, as well as in cross-linguistic studies and in varying settings and participant constellations. In conclusion, research into multimodal stance-taking is vibrant, with ample opportunities for future work. This review can be read as a call to action to move beyond the premise that stance-taking is multimodal and to further investigate this intriguing and fundamentally human capacity.