Multi-Modal Data (MMD) can help educational games researchers understand the synergistic relationship between players' movements and their learning experiences, and consequently uncover insights that may lead to improved design of movement-based game technologies for learning. Predicting player performance creates opportunities to improve educational experiences and outcomes. However, predicting players' performance from player-generated MMD during their interactions with educational Motion-Based Touchless Games (MBTGs) is challenging. To bridge this gap, we conducted an in-situ study in which 26 users, aged 11, played two maths MBTGs in a single 20-30 minute session. We collected the players' MMD (i.e., gaze data from eye-tracking glasses, physiological data from wristbands, and skeleton data from Kinect) produced during game-play. To investigate the potential of MMD for predicting players' academic performance, we applied machine learning techniques to the MMD derived from players' game-play. This allowed us to identify the MMD features that drive rapid, highly accurate predictions of players' academic performance in educational MBTGs, which might in turn enable real-time, proactive feedback to support players through their educational gaming experience. Our analysis compared two data lengths corresponding to the half and full duration of players' question-solving time. We showed that all combinations of extracted features associated with gaze, physiological, and skeleton data predicted student performance more accurately than the majority baseline. Additionally, the most accurate predictions of players' performance were derived from the combination of gaze and physiological data for both full and half data lengths. Our findings emphasise the significance of using MMD for real-time performance prediction in educational MBTGs and offer implications for practice.
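
To give a concrete sense of the evaluation described above, the sketch below shows one way a multimodal feature combination could be compared against a majority-class baseline. It is a minimal illustration under stated assumptions, not the study's actual pipeline: the feature loaders (`load_gaze_features`, `load_physio_features`, `load_labels`), the placeholder data, the choice of classifier, and the cross-validation settings are all hypothetical.

```python
# Minimal sketch (assumptions throughout): compare a classifier trained on a
# combination of multimodal features against a majority-class baseline.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def load_gaze_features():
    """Hypothetical loader: one row of gaze features per question attempt."""
    return np.random.rand(120, 8)   # placeholder data, not the study's data


def load_physio_features():
    """Hypothetical loader: wristband (physiological) features per attempt."""
    return np.random.rand(120, 6)   # placeholder data


def load_labels():
    """Hypothetical labels: 1 = question answered correctly, 0 = otherwise."""
    return np.random.randint(0, 2, size=120)


# Combine the modalities of interest (here: gaze + physiological features).
X = np.hstack([load_gaze_features(), load_physio_features()])
y = load_labels()

# Majority baseline: always predicts the most frequent label in the data.
baseline = DummyClassifier(strategy="most_frequent")
baseline_acc = cross_val_score(baseline, X, y, cv=5, scoring="accuracy").mean()

# Multimodal model: any off-the-shelf classifier; a random forest is one option.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model_acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

print(f"Majority baseline accuracy: {baseline_acc:.3f}")
print(f"Gaze + physiological model accuracy: {model_acc:.3f}")
```

The same comparison could be repeated for each modality combination and for the half- and full-duration data lengths, which mirrors the analysis design summarised above.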