Algorithms and programming are among the most challenging topics students face in undergraduate programs. Dropout and failure rates in courses covering these topics are usually high, which has drawn attention to strategies for mitigating the problem. Machine learning techniques can help by providing models that detect at-risk students early, so that lecturers, tutors, or staff can intervene pedagogically. To predict at-risk students early in introductory programming courses, we present a comparative study that seeks the best combination of datasets (sets of variables) and classification algorithms. Data collected from Moodle were used to generate 13 distinct datasets based on different aspects of student interaction (cognitive presence, social presence, and teaching presence) inside the virtual environment. Results show that there is no statistically significant difference among models generated from the different datasets and that counts of interactions, together with derived attributes, are sufficient for the task. Model performance varied by semester, with the best models able to detect at-risk students in the first week of the course with an AUC ROC between 0.7 and 0.9. Moreover, using SMOTE to balance the datasets did not improve model performance.
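For readers unfamiliar with the AUC ROC figures quoted above, the metric can be read as the probability that a randomly chosen at-risk student receives a higher risk score than a randomly chosen student who is not at risk. The following is a minimal sketch of that pairwise definition, not the study's evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC ROC via the pairwise (Mann-Whitney) definition.

    labels: iterable of 0/1, where 1 marks the positive class
            (here assumed to mean "at-risk").
    scores: iterable of model scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two at-risk students (label 1), two not at risk.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which places the reported 0.7 to 0.9 range between the two.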
Although oral cancer is considered a global health issue, with 350,000 people diagnosed each year, it can be treated successfully if diagnosed at an early stage. The Papanicolaou test is an inexpensive and non-invasive method, generally applied to detect cervical cancer, but it can also be useful for detecting cancer in the oral cavity. Manually analyzing cells to detect abnormalities is time-consuming and subject to variation in perception among professionals. This paper compares three deep learning (DL) approaches: segmentation, object detection, and image classification. Our results show that binary object detection with Faster R-CNN is the best approach for nuclei detection and localization (0.76 IoU). Since ResNet-34 performed well on abnormal nuclei classification (0.86 F1 score), we conclude that these two models can be combined into a reliable localization and classification pipeline. This work reinforces that automated analysis of oral cytology, using DL to build a pipeline for nuclei localization and classification, can help minimize the subjectivity of human analysis and support the detection of cancer at early stages.
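The two metrics reported above, IoU for localization and F1 score for classification, can be illustrated with a short sketch. It assumes axis-aligned bounding boxes in (x1, y1, x2, y2) form and raw true-positive, false-positive, and false-negative counts; the function names and example values are hypothetical, and the paper's own evaluation protocol may differ.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """F1 score from counts: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical values: a predicted nucleus box against a ground-truth box,
# and made-up classification counts.
example_iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

An IoU of 1.0 means the predicted and reference boxes coincide exactly, while an F1 score of 1.0 means no false positives and no false negatives.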