2021
DOI: 10.1007/978-3-030-86517-7_23

Feature Enhanced Capsule Networks for Robust Automatic Essay Scoring

Cited by 5 publications
(5 citation statements)
References 21 publications
“…For example, LLMs-based innovations have achieved high scores of quadratic weighted kappa (QWK) in essay scoring, specifically for off-topic (QWK = 0.80), gibberish (QWK = 0.80), and paraphrased answers (QWK = 0.94), indicating substantial to almost perfect agreements with human raters (Doewes & Pechenizkiy, 2021). Similar performances on essay scoring have been observed in several other studies (eg, 0.80 QWK in (Beseiso et al, 2021) and 0.81 QWK in (Sharma et al, 2021)). Likewise, LLMs-based innovations' performances on automatic short-answer grading were also highly correlated with human ratings (Pearson's correlation between 0.75 to 0.82) (Ahmed et al, 2022;Sawatzki et al, 2022).…”
Section: Performance (supporting)
confidence: 88%
“…For example, LLMs-based innovations have achieved high scores of quadratic weighted kappa (QWK) in essay scoring, specifically for off-topic (QWK = 0.80), gibberish (QWK = 0.80), and paraphrased answers (QWK = 0.94), indicating substantial to almost perfect agreements with human raters [15]. Similar performances on essay scoring have been observed in several other studies (e.g., 0.80 QWK in [6] and 0.81 QWK in [56]). Likewise, LLMs-based innovations' performances on automatic short-answer grading were also highly correlated with human ratings (Pearson's correlation between 0.75 to 0.82) [2,46].…”
Section: Practical Challenges - RQ2 (supporting)
confidence: 66%
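
The statements above report agreement with human raters as quadratic weighted kappa (QWK). As a rough illustration of what that metric measures, here is a minimal sketch in plain Python. It is not taken from any of the cited papers; the function name `quadratic_weighted_kappa`, its parameters, and the sample scores are assumptions made for this example. Disagreements are penalised by the squared distance between the two scores, and the observed penalty is compared with the penalty expected if the two raters scored independently.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Illustrative QWK between two equal-length lists of integer scores.

    Hypothetical helper for this example, not an API from the cited work.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    num_ratings = max_rating - min_rating + 1
    if num_ratings < 2:
        raise ValueError("the rating scale needs at least two levels")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (score_a, score_b)
    hist_a = Counter(rater_a)                  # marginal counts for rater A
    hist_b = Counter(rater_b)                  # marginal counts for rater B

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            # Quadratic penalty: 0 on the diagonal, 1 at the far corners.
            weight = (i - j) ** 2 / (num_ratings - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Count expected under independence of the two raters.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        # Both raters used one identical score throughout; kappa is undefined.
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1.0 - weighted_observed / weighted_expected


# Made-up scores on a 1-4 scale, for illustration only.
human = [1, 2, 3, 4, 2, 3]
model = [1, 2, 3, 4, 3, 3]
print(quadratic_weighted_kappa(human, model, 1, 4))
```

A value of 1 means the two sets of scores agree exactly, 0 means agreement no better than chance, and negative values mean systematic disagreement.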
“…The early stage of AES/AWE development is dominated by the traditional machine learning technologies based on Bayes' theorem (Rudner & Liang, 2002), linear regression (Phandi et al, 2015), rank preference learning (Chen & He, 2013;Yannakoudakis et al, 2011), reinforcement learning (Wang et al, 2015), etc. In recent years, the accelerated development of deep learning, especially neural network technology, has also greatly benefited the field of writing evaluation, such as Recurrent Neural Networks (Cai, 2019), Long Short-term Memory (Alikaniotis et al, 2016;Jin et al, 2018;Liu et al, 2019;Taghipour & Ng, 2016), Convolutional Neural Networks (CNN) (Dong & Zhang, 2016;Dong et al, 2017;Farag et al, 2017), BERT-based approach (Sharma et al, 2021). The introduction of attention mechanisms also significantly improved AES SOTA performance (Dong et al, 2017).…”
Section: Literature Review (mentioning)
confidence: 99%