The prosody of a second language (L2) is notoriously difficult to acquire. It requires the mastery of a range of nested multimodal systems, including articulatory but also gestural signals, as hand gestures are produced in close synchrony with spoken prosody. It remains unclear how easily the articulatory and gestural systems acquire new prosodic patterns in the L2 and how the two systems interact, especially when L1 patterns interfere. This interdisciplinary pre-registered study investigates how Dutch learners of Spanish produce multimodal lexical stress in Spanish-Dutch cognates (e.g., Spanish profeSOR vs. Dutch proFESsor). Acoustic analyses assess whether gesturing helps L2 speakers to place stress on the correct syllable; and whether gesturing boosts the acoustic correlates of stress through biomechanic coupling. Moreover, motion-tracking and time-series analyses test whether gesture-prosody synchrony is enhanced for stress-matching vs. stress-mismatching cognate pairs, perhaps revealing that gestural timing is biased in the L1 (or L2) direction (e.g., Spanish profeSOR with the gesture biased towards Dutch stressed syllable -fes). Thus, we will uncover how speakers deal with manual, articulatory, and cognitive constraints that need to be brought in harmony for efficient speech production, bearing implications for theories on gesture-speech interaction and multimodal L2 acquisition.