Objective: Brain-computer interfaces (BCIs) are emerging as promising cognitive training tools in neurodevelopmental disorders, as they combine the advantages of traditional computerized interventions with real-time tailored feedback. We propose a gamified BCI based on non-volitional neurofeedback for cognitive training, with the aim of developing a neurorehabilitation tool for Autism Spectrum Disorders (ASD). Approach: The BCI consists of an emotional facial expression paradigm (EFP) controlled by an intelligent agent that takes correct and incorrect actions, while the user observes and judges the agent’s actions. The agent learns an optimal strategy through reinforcement learning (RL), provided that the participant generates error-related potentials (ErrPs) in response to incorrect agent actions. We hypothesize that this training approach will allow not only the agent but also the BCI user to learn: by implicitly scrutinizing the agent’s errors, the user takes part in the learning process through operant conditioning. This makes the approach of particular interest for disorders in which error-monitoring processes are altered or compromised, such as ASD. In this paper, the main goal is to validate the whole methodological BCI approach and to assess whether it is feasible enough to move on to clinical experiments. A control group of 10 neurotypical participants and one participant with ASD tested the proposed BCI approach. Main Results: We achieved online balanced accuracies in ErrP detection of 81.6% and 77.1% for the two game modes, respectively. Additionally, all participants achieved an optimal RL strategy for the agent in at least one of the test sessions. Significance: The ErrP classification results, together with the fact that an optimal learning strategy could be reached, show that the proposed methodology is feasible. This supports moving towards clinical experimentation with ASD participants to assess the effectiveness of the approach as hypothesized.
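To make the learning loop concrete, the following is a minimal simulated sketch of how an agent might learn from noisy ErrP detections, and of how balanced accuracy is computed. It is not the authors' implementation. The state and action labels, the `TARGET` mapping, the reward values of +1 and -1, the epsilon-greedy exploration, the sample-average value update, and the detector rates passed as `tpr` and `tnr` are all assumptions made for illustration. The ErrP detector is replaced by a random simulation, so no EEG is processed.

```python
import random

# Hypothetical toy task (assumption): three abstract cues, each with one correct action.
TARGET = {"s0": "a0", "s1": "a1", "s2": "a2"}
ACTIONS = ["a0", "a1", "a2"]


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (on error trials) and specificity (on correct trials)."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def detect_errp(action_is_wrong, tpr, tnr, rng):
    """Simulated single-trial ErrP detector (assumption: fixed hit and rejection rates)."""
    if action_is_wrong:
        return rng.random() < tpr   # hit with probability tpr
    return rng.random() >= tnr      # false alarm with probability 1 - tnr


def train_agent(tpr, tnr, n_trials=3000, epsilon=0.1, seed=0):
    """One-step (bandit-style) value learning driven only by detected ErrPs."""
    rng = random.Random(seed)
    states = list(TARGET)
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    counts = {key: 0 for key in q}
    for _ in range(n_trials):
        s = rng.choice(states)
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)                          # explore
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])    # exploit
        errp = detect_errp(a != TARGET[s], tpr, tnr, rng)
        reward = -1.0 if errp else 1.0                       # assumed reward scheme
        counts[(s, a)] += 1
        q[(s, a)] += (reward - q[(s, a)]) / counts[(s, a)]   # running mean of rewards
    # Greedy policy extracted from the learned action values.
    return {s: max(ACTIONS, key=lambda act: q[(s, act)]) for s in states}


if __name__ == "__main__":
    policy = train_agent(tpr=0.8, tnr=0.8)
    print("learned policy:", policy)
    print("matches target:", policy == TARGET)
```

In this sketch the agent never sees the ground truth: the only learning signal is the detector output, so the expected reward of a correct action is `2 * tnr - 1` and that of an incorrect action is `1 - 2 * tpr`. The correct action is therefore preferred on average whenever `tpr + tnr > 1`, that is, whenever the balanced accuracy is above 50%, which is why imperfect single-trial ErrP detection can still be enough for the agent to reach the intended strategy.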