Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1192

Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context

Abstract: The problem of accurately predicting relative reading difficulty across a set of sentences arises in a number of important natural language applications, such as finding and curating effective usage examples for intelligent language tutoring systems. Yet while significant research has explored document- and passage-level reading difficulty, the special challenges involved in assessing aspects of readability for single sentences have received much less attention, particularly when considering the role of surround…

Cited by 14 publications (14 citation statements)
References 25 publications
“…along with their preceding or following sentence) rather than in isolation. More closely related to our study is the one by Schumacher et al (2016) on readability assessment. In that work, authors gathered pairwise evaluations of reading difficulty on sentences presented with and without a larger context, training a logistic regression model to predict binary complexity labels assigned by humans.…”
Section: Introduction (citation type: mentioning)
confidence: 67%
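The statement above describes a pairwise setup: a logistic regression model is trained to predict which of two sentences human judges found harder to read. The sketch below illustrates that idea under simplifying assumptions; the surface features (token count, mean word length) and the toy data are placeholders, not the lexical, syntactic, and AoA-based features used by Schumacher et al. (2016).

```python
# Minimal sketch of a pairwise relative-difficulty classifier.
# Features and data below are illustrative assumptions, not the cited work's.
import numpy as np
from sklearn.linear_model import LogisticRegression

def surface_features(sentence: str) -> np.ndarray:
    """Illustrative sentence features: token count and mean word length."""
    tokens = sentence.split()
    return np.array([len(tokens), np.mean([len(t) for t in tokens])])

def pairwise_features(sent_a: str, sent_b: str) -> np.ndarray:
    """Represent a pair by the difference of per-sentence feature vectors."""
    return surface_features(sent_a) - surface_features(sent_b)

# Hypothetical training pairs: label 1 if the first sentence was judged
# harder to read than the second, else 0.
pairs = [("The cat sat on the mat.",
          "Notwithstanding prior precedent, the court demurred."),
         ("Quantum decoherence constrains macroscopic superposition.",
          "Dogs like to play fetch.")]
labels = [0, 1]

X = np.vstack([pairwise_features(a, b) for a, b in pairs])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```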
“…Among the shallow models we use Naive Bayes (NB), Logistic Regression (LR), Support Vector Machines (SVM) and Random Forests (RF) classifiers trained with unigrams, bigrams and trigrams as features. We also train the classifiers using the lexical and syntactic features proposed in (Schumacher et al, 2016) combined with the n-gram features (denoted as "enriched features"). We include neural network models such as word and char-level Long Short-Term Memory Network (LSTM) and Convolutional Neural Networks (CNN).…”
Section: Candidate Models (citation type: mentioning)
confidence: 99%
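As a concrete illustration of the shallow baselines named in that statement, the sketch below trains a logistic regression classifier over unigram-to-trigram counts. The sentences and labels are placeholders, not the data or the "enriched features" used in the citing work.

```python
# Minimal sketch: logistic regression over unigram-trigram count features.
# Training data here is illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = ["The cat sat on the mat.",
             "Notwithstanding prior precedent, the court demurred.",
             "Dogs like to play fetch.",
             "Quantum decoherence constrains macroscopic superposition."]
labels = [0, 1, 0, 1]  # 1 = complex, 0 = simple (illustrative labels)

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),   # unigrams, bigrams, trigrams
    LogisticRegression(max_iter=1000),
)
model.fit(sentences, labels)
print(model.predict(["The committee adjourned sine die."]))
```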
“…Traditionally, measuring the level of reading difficulty is done through lexicon and rule-based metrics such as the age of acquisition lexicon (AoA) (Kuperman et al, 2012) and the Flesch-Kincaid Grade Level (Kincaid et al, 1975). A machine learning based approach in (Schumacher et al, 2016) extracts lexical, syntactic, and discourse features and train logistic regression classifiers to predict the relative complexity of a single sentence in a pairwise setting. The most predictive features are simple representations based on AoA norms.…”
Section: Readability Assessment (citation type: mentioning)
confidence: 99%
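For reference, the Flesch-Kincaid Grade Level mentioned above is a simple formula over word, sentence, and syllable counts: 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below applies it to a single sentence; the vowel-group syllable counter is a rough approximation, not the original dictionary-based count.

```python
# Sketch of the Flesch-Kincaid Grade Level for a single sentence.
# Syllables are approximated as runs of consecutive vowels.
import re

def count_syllables(word: str) -> int:
    """Approximate syllable count as the number of vowel groups."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(sentence: str) -> float:
    words = re.findall(r"[A-Za-z]+", sentence)
    num_words = len(words)
    num_syllables = sum(count_syllables(w) for w in words)
    # FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    # Here the "document" is one sentence, so words / sentences = num_words.
    return 0.39 * num_words + 11.8 * num_syllables / num_words - 15.59

print(flesch_kincaid_grade("The cat sat on the mat."))
print(flesch_kincaid_grade("Notwithstanding prior precedent, the court demurred."))
```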
“…Step 1: Per-sentence reading difficulty estimation. The precise estimation of sentence-level readability is a hard problem and has recently attracted the attention of many researchers (Pilán, Volodina, & Johansson, 2014;Schumacher, Eskenazi, Frishkoff, & Collins-Thompson, 2016;Vajjala & Meurers, 2014). For efficiency, we use heuristic functions to make a rough estimation.…”
Section: The Coupled Bag-of-words Model (citation type: mentioning)
confidence: 99%
“…For efficiency, we use heuristic functions to make a rough estimation. Specifically, we consider the linguistic features designed for readability assessment that have been demonstrated to be effective in previous studies (Feng et al, 2010;Schumacher et al, 2016), and choose the most used linguistic features that can be operated at the sentence level to build the heuristic functions. In total, eight heuristic functions h 2 {len, ans, anc, lv, art, ntr, pth, anp} corresponding to eight distinct features from three aspects are used to compute the reading score of a sentence, as shown in Table 1.…”
Section: The Coupled Bag-of-words Model (citation type: mentioning)
confidence: 99%
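The citing work combines eight heuristic functions (len, ans, anc, lv, art, ntr, pth, anp) defined in its Table 1, which is not reproduced here. The sketch below only illustrates the general pattern of scoring a sentence with a set of cheap heuristics; the two features and the equal weighting are placeholder assumptions, not the authors' actual definitions.

```python
# Illustrative sketch of heuristic sentence-level readability scoring.
# The features and the equal weighting are assumptions for illustration only.
def sentence_length(sentence: str) -> float:
    """Number of whitespace-separated tokens."""
    return float(len(sentence.split()))

def avg_word_length(sentence: str) -> float:
    """Mean number of characters per token."""
    tokens = sentence.split()
    return sum(len(t) for t in tokens) / len(tokens)

HEURISTICS = [sentence_length, avg_word_length]

def reading_score(sentence: str) -> float:
    """Combine heuristic scores with equal weights (illustrative choice)."""
    return sum(h(sentence) for h in HEURISTICS) / len(HEURISTICS)

print(reading_score("The cat sat on the mat."))
print(reading_score("Notwithstanding prior precedent, the court demurred."))
```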