Automatic Essay Scoring Incorporating Rating Schema via Reinforcement Learning

Wang, Yucheng; Wei, Zhongyu; Zhou, Yiming; Huang, Xuanjing

doi:10.18653/v1/d18-1090

Cited by 57 publications

(29 citation statements)

References 17 publications


“…• RL1 Wang et al (2018b) proposed a reinforcement learning based model. In that paper, QWK is used as the reward function, and classification is used to gain the scores.…”

Section: Baselines and Implementation Details (mentioning)

confidence: 99%
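The statement above names QWK (Quadratic Weighted Kappa) as the reward signal. As a rough illustration of what that metric measures, here is a minimal plain-Python sketch of the standard QWK definition. It is not the reward code from Wang et al.; the function name, the argument names, and the choice to return 1.0 in the degenerate case where no disagreement is possible are all assumptions made for this example.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Agreement between two equal-length lists of integer scores.

    Hypothetical helper for illustration: 1.0 means perfect agreement,
    0.0 means chance-level agreement, negative means worse than chance.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    num_ratings = max_score - min_score + 1
    if num_ratings < 2:
        raise ValueError("need at least two possible score values")

    total = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts O[i, j]
    hist_a = Counter(rater_a)                  # marginal counts per rater
    hist_b = Counter(rater_b)

    numerator = 0.0
    denominator = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty grows with the squared score distance.
            weight = (i - j) ** 2 / (num_ratings - 1) ** 2
            # Count expected if the two raters were independent.
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[(i, j)]
            denominator += weight * expected

    if denominator == 0.0:
        # Assumption: both raters used one identical score throughout,
        # so no disagreement is possible; treat that as full agreement.
        return 1.0
    return 1.0 - numerator / denominator
```

Used as a reward, a value like this would be computed over a group of essays rather than a single one, since agreement statistics are undefined for one pair of scores; how the cited paper forms those groups is not stated in the excerpt.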


Enhancing Automated Essay Scoring Performance via Fine-tuning Pre-trained Language Models with Combination of Regression and Ranking

Yang, Cao, Wen et al. 2020

Findings of the Association for Computational Linguistics: EMNLP 2020


Automated Essay Scoring (AES) is a critical text regression task that automatically assigns scores to essays based on their writing quality. Recently, the performance of sentence prediction tasks has been largely improved by using pre-trained language models via fusing representations from different layers, constructing an auxiliary sentence, using multi-task learning, etc. However, to solve the AES task, previous works utilize shallow neural networks to learn essay representations and constrain calculated scores with either regression loss or ranking loss. Shallow neural networks trained on limited samples perform poorly at capturing the deep semantics of texts, and without an accurate scoring function, ranking loss and regression loss measure two different aspects of the calculated scores. To improve AES performance, we find a new way to fine-tune pre-trained language models with multiple losses of the same task. In this paper, we propose to utilize a pre-trained language model to learn text representations first. With scores calculated from the representations, mean square error loss and the batch-wise ListNet loss with dynamic weights constrain the scores simultaneously. We utilize Quadratic Weighted Kappa to evaluate our model on the Automated Student Assessment Prize dataset. Our model outperforms not only state-of-the-art neural models by nearly 3 percent but also the latest statistical model. Especially on the two narrative prompts, our model performs much better than all other state-of-the-art models.
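To make the loss combination described in this abstract more concrete, the following is a small plain-Python sketch of a mean squared error term mixed with a batch-wise top-one ListNet term. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the mixing weight `alpha` is simply passed in by the caller because the abstract does not say how the dynamic weights are scheduled.

```python
import math


def log_softmax(values):
    """Numerically stable log of the softmax over a list of floats."""
    peak = max(values)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in values))
    return [v - log_norm for v in values]


def mse_loss(predicted, gold):
    """Mean squared error between predicted and gold scores."""
    return sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold)


def listnet_loss(predicted, gold):
    """Top-one ListNet loss over one batch.

    Cross-entropy between the softmax of the gold scores and the softmax
    of the predicted scores, so it depends only on how the predictions
    rank the essays within the batch, not on their absolute values.
    """
    gold_probs = [math.exp(lp) for lp in log_softmax(gold)]
    pred_log_probs = log_softmax(predicted)
    return -sum(g * lp for g, lp in zip(gold_probs, pred_log_probs))


def combined_loss(predicted, gold, alpha):
    """Weighted mix of the two terms.

    `alpha` is a hypothetical stand-in for the paper's dynamic weight;
    a caller could vary it per epoch, but no schedule is assumed here.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * mse_loss(predicted, gold) + (1.0 - alpha) * listnet_loss(predicted, gold)
```

The two terms are complementary: shifting every prediction by a constant changes the regression term but leaves the ranking term untouched, which is one way to see the abstract's remark that the two losses measure different aspects of the calculated scores.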


“…Reinforcement learning based models are also possible solutions. Wang et al (2018b) utilized dilated LSTM to learn text representations.…”

Section: Introduction (mentioning)

confidence: 99%

Enhancing Automated Essay Scoring Performance via Fine-tuning Pre-trained Language Models with Combination of Regression and Ranking

Yang, Cao, Wen et al. 2020

Findings of the Association for Computational Linguistics: EMNLP 2020


“…Our work differs from prior efforts primarily in the particular architecture that we use. Most prior work uses LSTMs (Farag et al, 2018;Wang et al, 2018;Cummins and Rei, 2018) or a combination LSTMs and CNNs (Taghipour and Ng, 2016;Zhang and Litman, 2018), cast as linear or logistic regression problems. In contrast, we use a hierarchically structured model with ordinal regression.…”

Section: Related Work (mentioning)

confidence: 99%

“…These problems (and the success of deep learning in other areas of language processing) have led to the development of neural methods for automatic essay scoring, moving away from feature engineering. A variety of studies (mostly LSTM-based) have reported AES performance comparable to or better than feature-based models (Taghipour and Ng, 2016;Cummins and Rei, 2018;Wang et al, 2018;Jin et al, 2018;Farag et al, 2018;Zhang and Litman, 2018). However, the current state-of-the-art models still use a combination of neural models and hand-crafted features (Liu et al, 2019).…”

Section: Introduction (mentioning)

confidence: 99%

Automated Essay Scoring with Discourse-Aware Neural Models

Nadeem, Nguyen, Liu et al. 2019

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications


Automated essay scoring systems typically rely on hand-crafted features to predict essay quality, but such systems are limited by the cost of feature engineering. Neural networks offer an alternative to feature engineering, but they typically require more annotated data. This paper explores network structures, contextualized embeddings and pre-training strategies aimed at capturing discourse characteristics of essays. Experiments on three essay scoring tasks show benefits from all three strategies in different combinations, with simpler architectures being more effective when less training data is available.


“…Recent work on automated essay scoring has largely focused on holistic scoring, which summarizes the quality of an essay with a single score (e.g., Taghipour and Ng (2016), Dong et al (2017), Wang et al (2018)). There are at least two reasons for this focus.…”

Section: Introduction (mentioning)

confidence: 99%

Give Me More Feedback II: Annotating Thesis Strength and Related Attributes in Student Essays

Ke, Inamdar, Lin et al. 2019

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics


While the vast majority of existing work on automated essay scoring has focused on holistic scoring, researchers have recently begun work on scoring specific dimensions of essay quality. Nevertheless, progress in dimension-specific essay scoring research is hindered in part by the lack of annotated corpora. To facilitate advances in this area of research, we design a rubric for scoring an important, yet unexplored dimension of persuasive essay quality, thesis strength, and annotate a corpus of essays with thesis strength scores. We additionally identify the attributes that could impact thesis strength and annotate the essays with the values of these attributes, which, when predicted by computational models, could provide feedback to students on why their essay receives a particular thesis strength score.
