Many educational institutions use online judges in programming classes, among other reasons, to give students faster feedback and to reduce the teacher's workload. There is some evidence that online judges also help reduce dropout. Nevertheless, dropout remains high in introductory programming classes. The objective of this work is therefore to develop and validate a method for predicting student dropout using data from the first two weeks of study, to allow for early intervention. Instead of the classical questionnaire-based method, we opted for a non-subjective, data-driven approach. However, such approaches are known to suffer from a potential overload of factors, which may not all be relevant to the prediction task. We reached a promising accuracy of 80% and explicitly extracted the main factors leading to student dropout.
Many researchers have started extracting student behaviour by cleaning data collected from web environments and using it as features in machine learning (ML) models. Using log data collected from an online judge, we compiled a set of successful features correlated with the student grade and applied them to a database representing 486 CS1 students. We used this set of features in ML pipelines that were optimised by combining an automated approach based on an evolutionary algorithm with hyperparameter tuning by random search. As a result, we achieved an accuracy of 75.55%, using data from only the first two weeks to predict the students' final grades. We show how our pipeline outperforms state-of-the-art work on similar scenarios.
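As a rough illustration of the random-search step mentioned above, the following minimal sketch samples hyperparameters at random and keeps the best-scoring configuration. It is not the pipeline described in the abstract: the toy model (`predict`), its two hyperparameters (`weight`, `threshold`) and the synthetic data are all hypothetical and exist only to make the example runnable.

```python
import random


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def predict(rows, weight, threshold):
    """Hypothetical toy model: weighted sum of two features against a threshold."""
    return [int(weight * a + (1 - weight) * b >= threshold) for a, b in rows]


def random_search(rows, labels, n_trials, seed=0):
    """Sample hyperparameters uniformly at random and keep the best-scoring pair."""
    rng = random.Random(seed)
    best_params, best_score = None, -1.0
    for _ in range(n_trials):
        params = {
            "weight": rng.uniform(0.0, 1.0),
            "threshold": rng.uniform(0.0, 1.0),
        }
        score = accuracy(predict(rows, **params), labels)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic placeholder data (assumption): two features per student and a
# binary label. None of this comes from the study's dataset.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [int(0.7 * a + 0.3 * b >= 0.5) for a, b in rows]

best_params, best_score = random_search(rows, labels, n_trials=50)
print(best_params, best_score)
```

In a real setting the score would be computed on held-out validation data rather than on the same rows used to define the labels, and the search space would cover the hyperparameters of an actual learning algorithm.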
Computing-related undergraduate students are encouraged to participate in Project-based Learning (PBL) through capstone courses in order to bridge the gap between the educational and industrial worlds of software engineering (SE). In these courses, students improve their skills with industrial tools and processes and engage in real-world projects. One of the challenges of this kind of course is how to monitor students' progress. In this work, we propose a software tool based on statistical analysis and data-mining algorithms to investigate the usefulness of students' communication logs to support professors' pedagogical activities during a capstone course involving three different SE disciplines. Our results indicate the feasibility of using textual content and metadata extracted from Slack logs to identify opportunities for the professor's intervention. A quantitative analysis reveals an average precision of 81% at identifying the top-5 relevant sentences registered in the log.
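One common way to compute a figure like the one reported above is precision at k, averaged over several evaluation cases. The sketch below assumes that reading of the metric; the abstract does not spell out its exact definition, and the sentence identifiers are made up for illustration.

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Share of the top-k ranked items that are marked as relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k


def mean_precision_at_k(cases, k):
    """Average precision@k over (ranked_items, relevant_items) pairs."""
    scores = [precision_at_k(ranked, relevant, k) for ranked, relevant in cases]
    return sum(scores) / len(scores)


# Hypothetical sentence IDs: a ranking produced by some tool, and the set of
# sentences a human annotator considered relevant.
ranked = ["s3", "s1", "s7", "s2", "s9", "s4"]
relevant = {"s1", "s2", "s3", "s5"}

p5 = precision_at_k(ranked, relevant, k=5)  # 3 of the top 5 are relevant
print(p5)
```

Dividing by k rather than by the number of relevant items keeps the score comparable across cases with different amounts of relevant content.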