BACKGROUND AND OBJECTIVE: EPELI (Executive Performance of Everyday LIving) is a Virtual Reality (VR) task developed to study goal-directed behavior in everyday life contexts. To examine whether an immersive version implemented with a head-mounted display (HMD) and a non-immersive version using a flat screen display (FSD) yield similar results, we had 72 typically developing 9- to 13-year-old children play both versions in a counterbalanced order. The children’s everyday executive functions were assessed with the parent-rated Behavior Rating Inventory of Executive Function (BRIEF) questionnaire. To assess the applicability of EPELI for online testing, half of the FSD version gameplays were conducted remotely and the rest in the laboratory.
RESULTS: All EPELI performance measures were correlated across the versions. The children’s performance was mostly similar in the two versions, but small effects reflecting higher performance in FSD-EPELI were found in the measures of Total score, Task efficacy, and Time-based prospective memory score. The children also monitored the time more actively in FSD-EPELI. While the children rated the feeling of presence and the usability of both versions favorably, most children preferred HMD-EPELI and rated its environment as more involving and realistic. Both versions showed only negligible problems with interface quality. No differences in task performance or subjective evaluations were found between the home-based and laboratory-based assessments of FSD-EPELI. In both EPELI versions, the efficacy measures were correlated with BRIEF scores at the first assessment, but not at the second. This raises questions about the stability of the associations reported between executive function tasks and questionnaires.
CONCLUSIONS: Both the HMD and FSD versions of EPELI are viable tools for the naturalistic assessment of goal-directed behavior in children. While the HMD version provides a more immersive user experience and naturalistic movement tracking, the FSD version can maximize scalability, reach, and cost-effectiveness, as it can be used with common hardware and administered remotely. Taken together, the findings highlight similarities between the HMD and FSD versions of a cognitively complex VR task, but also underline the specific advantages of each of these common presentation modes.