Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.535

Neural Automated Essay Scoring Incorporating Handcrafted Features

Abstract: Automated essay scoring (AES) is the task of automatically assigning scores to essays as an alternative to grading by human raters. Conventional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks (DNNs) to obviate the need for feature engineering. Furthermore, hybrid methods that integrate handcrafted features in a DNN-AES model have been recently developed and have achieved state-of-the-art accuracy. One of the most popular hybrid method…

Cited by: 55 publications (30 citation statements)
References: 33 publications
“…Additionally, our model could be enhanced by multiple techniques found in other works such as adding domain-specific features [47,51] or recognizing question types or their difficulty [34], and utilizing specialized methods for each of them though on the downside this increases system complexity. It is especially interesting to leverage rubrics as defined by teachers.…”
Section: Discussion and Limitations (mentioning)
confidence: 99%
“…Most modern NLP systems have started to use attention-based transformer networks and large pretrained language models. Yang et al (2020), Uto et al (2020) use the BERT base-uncased (Devlin et al, 2019) pre-trained language model to perform automatic essay grading achieving QWKs in the range of 0.79 to 0.805. However, BERT has about 110 million parameters (compared to our largest model with just under 2 million parameters).…”
Section: Comparison With Transformer Models (mentioning)
confidence: 99%
“…Transformers are generally able to vastly outperform regression on engineered features. However, in some text labeling tasks, such as essay scoring, it has been shown that engineered features can be used in tandem with Transformer output to improve performance (Uto et al, 2020). This can be achieved simply by concatenating a vector of features f_n to BERT's CLS vector.…”
Section: Sentence-level Features (mentioning)
confidence: 99%
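
The concatenation described in the statement above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the cited authors' implementation: the [CLS] vector is a random placeholder standing in for real encoder output, the three feature values are made up, and `concat_features` and `linear_score` are hypothetical helper names. The sigmoid scoring head is also an assumption; the statement only describes the concatenation step.

```python
import math
import random

HIDDEN_SIZE = 768  # dimensionality of the BERT-base [CLS] vector


def concat_features(cls_vector, features):
    """Append the handcrafted feature vector f_n to the [CLS] vector."""
    return list(cls_vector) + list(features)


def linear_score(vector, weights, bias=0.0):
    """Hypothetical scoring head: sigmoid of a weighted sum (an assumption)."""
    z = sum(w * v for w, v in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)

# Placeholder for the encoder output; a real system would take this from BERT.
cls_vector = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]

# Placeholder handcrafted features, assumed to be normalized already.
f_n = [0.42, 0.10, 0.03]

combined = concat_features(cls_vector, f_n)

# Untrained, randomly initialized weights, used only to show the shapes line up.
weights = [rng.uniform(-0.01, 0.01) for _ in combined]
score = linear_score(combined, weights)
```

In this sketch the downstream layer simply sees a longer input vector (768 encoder dimensions plus one dimension per handcrafted feature), so the encoder itself needs no changes.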
“…Our set of features was inspired by (Uto et al, 2020), but we excluded the readability metrics because they are not as relevant for our task. Specifically, for text sample x_n, we calculate the number of words, number of sentences, number of exclamation marks, question marks, and commas, average word length, average sentence length, the number of nouns, verbs, adjectives, and adverbs, and the number of stop words.…”
Section: Sentence-level Features (mentioning)
confidence: 99%
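
As a rough illustration of the surface features listed in the statement above, the sketch below computes the counts that need no external tooling. It is an assumption-laden example, not the cited authors' code: `surface_features` is a hypothetical name, the regex tokenization and sentence splitting are simplifications, and `STOP_WORDS` is a tiny illustrative list rather than a real stop word inventory. The part-of-speech counts (nouns, verbs, adjectives, adverbs) are omitted because they would require a POS tagger.

```python
import re

# Tiny illustrative stop word list (an assumption, not a standard inventory).
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "in", "was"}


def surface_features(text):
    """Compute simple count-based features for one text sample x_n."""
    # Simplified tokenization: runs of letters and apostrophes count as words.
    words = re.findall(r"[A-Za-z']+", text)
    # Simplified sentence splitting on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    n_sentences = len(sentences)
    return {
        "num_words": n_words,
        "num_sentences": n_sentences,
        "num_exclamations": text.count("!"),
        "num_questions": text.count("?"),
        "num_commas": text.count(","),
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / n_sentences if n_sentences else 0.0,
        "num_stop_words": sum(w.lower() in STOP_WORDS for w in words),
    }


sample = "Is it late? Yes, the bus was early!"
features = surface_features(sample)
```

The values of the resulting dictionary, in a fixed key order, could serve as the feature vector f_n; in practice such counts would usually be normalized before being combined with encoder output.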