Knowledge tracing models, which estimate students' ability or knowledge based on data collected from their work on learning-related tasks, are widely studied within the educational data mining domain. In this work, we review and evaluate a body of deep learning knowledge tracing (DLKT) models using openly available and widely used datasets, as well as a novel dataset of students learning to program. We re-implement the evaluated DLKT models to assess the reproducibility of previously reported results and the level of detail with which the models and their evaluations have been described in previously published articles. The DLKT models are tested with different input and output layer variations found in the compared models, which are independent of the models' main architectures, as well as with different maximum attempt count options. We use several metrics to compare and contrast the results and to reflect on the quality and appropriateness of the evaluated knowledge tracing models. The evaluated knowledge tracing models include Vanilla-DKT, two variants of Long Short-Term Memory Deep Knowledge Tracing (LSTM-DKT), two variants of Dynamic Key-Value Memory Network (DKVMN), and Self-Attentive Knowledge Tracing (SAKT). As baselines, we evaluate logistic regression, Bayesian Knowledge Tracing (BKT), and simple non-learning models. Our empirical evaluation suggests that while the DLKT models with tuned hyperparameters generally outperform non-deep-learning models, the relative differences between the DLKT models are subtle and often vary between datasets. In particular, we observe that no single model consistently outperforms all other models across all datasets. Our results also show that on some datasets, simple non-learning models such as mean prediction can outperform more sophisticated knowledge tracing models, especially in terms of accuracy. Further, our metric and hyperparameter analysis shows that the metric used to select the best model hyperparameters has a noticeable effect on model performance, and that some metrics appear more favorable than others for certain models. We also study the effect of input and output layer variations on model performance, and analyze the impact of filtering out long attempt sequences, a practice that has been used both implicitly and explicitly in some studies. We further discuss the effect of non-model properties, such as randomness and hardware, on model performance. Finally, we discuss the replicability of model performance and related issues, including pitfalls, and suggest practices for future work. Our model implementations, evaluation code, and data are published as part of this work.