Automatic Short Answer Grading is the study field that addresses the assessment of students' answers to questions posed in natural language. Besides length, it differs from automatic essay grading in that it evaluates content rather than the answer's style. Grading the answers is generally framed as a typical supervised classification task. Many works have been developed recently, but most of them deal with data in the English language. In this paper, we present a new Portuguese dataset and system for automatic short answer grading. The data was collected with the participation of 13 teachers, 12 undergraduate students, and 245 elementary school students. Results reached 69% accuracy in four-class classification and 85% in binary classification.
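To make the "supervised classification" framing concrete, the sketch below shows one common baseline setup: student answers vectorized with TF-IDF and fed to a multi-class classifier. The toy answers and four-class grade labels are hypothetical placeholders, not the paper's dataset or method.

```python
# Minimal sketch: short answer grading as supervised text classification.
# All data here is toy/hypothetical; a real system would train on a graded
# corpus such as the Portuguese dataset described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set: answers with four-class grades (0 = wrong ... 3 = correct).
train_answers = [
    "photosynthesis converts light into chemical energy",
    "plants use sunlight to make glucose and oxygen",
    "the plant eats soil to grow",
    "plants breathe air at night",
    "light energy becomes sugar in the leaves",
    "the sun warms the plant",
]
train_grades = [3, 3, 0, 1, 2, 0]

# TF-IDF features + multi-class logistic regression in one pipeline.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_answers, train_grades)

# Predict a grade for an unseen answer.
print(model.predict(["plants turn sunlight into sugar"]))
```

A binary variant of the same pipeline (correct vs. incorrect) corresponds to the two-class setting in which the paper reports 85% accuracy.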
Abstract. Assessments are widely used in learning contexts to verify how much knowledge students are retaining. Discursive questions can assess different levels of student learning when compared with multiple-choice questions. However, because they are easier to grade, multiple-choice questions are generally used more often. Aiming to help with this problem and to give teachers a tool that lets them apply assessments with discursive questions without worrying about grading time, this work presents the Auto-Avaliador Colaborativo e Inteligente de Respostas (Collaborative and Intelligent Automatic Answer Grader), a tool for the automatic grading of this type of question.
Short answers are routinely used in learning environments for student assessment. Despite their importance, teachers find the task of assessing discursive answers very time-consuming. Aiming to assist with this problem, this work explores the Automatic Short Answer Grading (ASAG) field using a machine learning approach. The literature was reviewed and 44 papers using different techniques were analyzed along many dimensions. A Portuguese dataset was built with more than 7,000 short answers. Different approaches were evaluated and a final model was created from their combination. The model's effectiveness proved satisfactory, with kappa scores indicating moderate to substantial agreement between the model and human grading.
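The kappa scores mentioned here measure chance-corrected agreement between model and human grades. A minimal sketch of that evaluation step is shown below, using scikit-learn's Cohen's kappa; the two grade lists are hypothetical placeholders, not the work's actual results.

```python
# Sketch: measuring model-human grading agreement with Cohen's kappa.
# Both grade sequences below are made-up examples for illustration.
from sklearn.metrics import cohen_kappa_score

human_grades = [3, 2, 0, 1, 3, 2, 1, 0, 3, 2]
model_grades = [3, 2, 1, 1, 3, 2, 0, 0, 3, 1]

kappa = cohen_kappa_score(human_grades, model_grades)
# Landis & Koch interpretation: 0.41-0.60 = moderate, 0.61-0.80 = substantial.
print(f"kappa = {kappa:.2f}")
```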
Automatic short answer grading is the study field that addresses the assessment of students' answers to questions posed in natural language. Grading the answers is generally framed as a typical supervised classification task. To stimulate research in the field, two datasets were publicly released in the SemEval 2013 competition task "Student Response Analysis". Since then, several works have been developed to improve the results. In this context, the goal of this work is to tackle that task by implementing lessons learned from the literature in an effective way and to report results for both datasets and all of their scenarios. The proposed method obtained better results in most scenarios of the competition task and, therefore, higher overall scores than recent works.