Although there is a move toward open data, with research funding bodies increasingly requiring data management plans and dissemination strategies, the data management challenges inherent in virtual exchange research are understudied. Data collection is often reported in papers addressing interaction analysis or language development, but little attention has been paid to critical discussion of data collection and structuration methods, or to practical advice that would encourage the dissemination of data and corpora. This paper reports on two phases of the Multimodal Teletandem Corpus project (Aranha & Lopes, 2019), which structured 581 hours of video data from Portuguese-English teletandem sessions, 351 chat logs, 956 written productions exchanged between the partners (original, revised, and corrected versions), 91 initial and 41 final questionnaires, and 666 learning diaries. We describe the data management problems we faced, which included the organization of the data collected, ethical consent, management of a large quantity of data, inclusion of sociolinguistic information, and expansion of learning theories, along with the solutions found. We then outline the data management planning steps that are consequently being introduced for future telecollaboration instantiations.