Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1192

Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context

Abstract: The problem of accurately predicting relative reading difficulty across a set of sentences arises in a number of important natural language applications, such as finding and curating effective usage examples for intelligent language tutoring systems. Yet while significant research has explored document- and passage-level reading difficulty, the special challenges involved in assessing aspects of readability for single sentences have received much less attention, particularly when considering the role of surround…

Cited by 14 publications (14 citation statements)
References 25 publications
“…along with their preceding or following sentence) rather than in isolation. More closely related to our study is the one by Schumacher et al (2016) on readability assessment. In that work, authors gathered pairwise evaluations of reading difficulty on sentences presented with and without a larger context, training a logistic regression model to predict binary complexity labels assigned by humans.…”
Section: Introduction (citation type: mentioning)
confidence: 67%
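The statement above describes a pairwise setup: a logistic regression model is trained to predict which of two sentences human judges found harder to read. The sketch below illustrates that idea under simplifying assumptions; the surface features (token count, mean word length) and the toy data are placeholders, not the lexical, syntactic, and AoA-based features used by Schumacher et al. (2016).

```python
# Minimal sketch of a pairwise relative-difficulty classifier.
# Features and data below are illustrative assumptions, not the cited work's.
import numpy as np
from sklearn.linear_model import LogisticRegression

def surface_features(sentence: str) -> np.ndarray:
    """Illustrative sentence features: token count and mean word length."""
    tokens = sentence.split()
    return np.array([len(tokens), np.mean([len(t) for t in tokens])])

def pairwise_features(sent_a: str, sent_b: str) -> np.ndarray:
    """Represent a pair by the difference of per-sentence feature vectors."""
    return surface_features(sent_a) - surface_features(sent_b)

# Hypothetical training pairs: label 1 if the first sentence was judged
# harder to read than the second, else 0.
pairs = [("The cat sat on the mat.",
          "Notwithstanding prior precedent, the court demurred."),
         ("Quantum decoherence constrains macroscopic superposition.",
          "Dogs like to play fetch.")]
labels = [0, 1]

X = np.vstack([pairwise_features(a, b) for a, b in pairs])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```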
“…Among the shallow models we use Naive Bayes (NB), Logistic Regression (LR), Support Vector Machines (SVM) and Random Forests (RF) classifiers trained with unigrams, bigrams and trigrams as features. We also train the classifiers using the lexical and syntactic features proposed in (Schumacher et al, 2016) combined with the n-gram features (denoted as "enriched features"). We include neural network models such as word and char-level Long Short-Term Memory Network (LSTM) and Convolutional Neural Networks (CNN).…”
Section: Candidate Models (citation type: mentioning)
confidence: 99%
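As a concrete illustration of the shallow baselines named in that statement, the sketch below trains a logistic regression classifier over unigram-to-trigram counts. The sentences and labels are placeholders, not the data or the "enriched features" used in the citing work.

```python
# Minimal sketch: logistic regression over unigram-trigram count features.
# Training data here is illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = ["The cat sat on the mat.",
             "Notwithstanding prior precedent, the court demurred.",
             "Dogs like to play fetch.",
             "Quantum decoherence constrains macroscopic superposition."]
labels = [0, 1, 0, 1]  # 1 = complex, 0 = simple (illustrative labels)

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 3)),   # unigrams, bigrams, trigrams
    LogisticRegression(max_iter=1000),
)
model.fit(sentences, labels)
print(model.predict(["The committee adjourned sine die."]))
```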
“…Traditionally, measuring the level of reading difficulty is done through lexicon and rule-based metrics such as the age of acquisition lexicon (AoA) (Kuperman et al, 2012) and the Flesch-Kincaid Grade Level (Kincaid et al, 1975). A machine learning based approach in (Schumacher et al, 2016) extracts lexical, syntactic, and discourse features and train logistic regression classifiers to predict the relative complexity of a single sentence in a pairwise setting. The most predictive features are simple representations based on AoA norms.…”
Section: Readability Assessment (citation type: mentioning)
confidence: 99%
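For reference, the Flesch-Kincaid Grade Level mentioned above is a simple formula over word, sentence, and syllable counts: 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below applies it to a single sentence; the vowel-group syllable counter is a rough approximation, not the original dictionary-based count.

```python
# Sketch of the Flesch-Kincaid Grade Level for a single sentence.
# Syllables are approximated as runs of consecutive vowels.
import re

def count_syllables(word: str) -> int:
    """Approximate syllable count as the number of vowel groups."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(sentence: str) -> float:
    words = re.findall(r"[A-Za-z]+", sentence)
    num_words = len(words)
    num_syllables = sum(count_syllables(w) for w in words)
    # FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    # Here the "document" is one sentence, so words / sentences = num_words.
    return 0.39 * num_words + 11.8 * num_syllables / num_words - 15.59

print(flesch_kincaid_grade("The cat sat on the mat."))
print(flesch_kincaid_grade("Notwithstanding prior precedent, the court demurred."))
```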
“…Step 1: Per-sentence reading difficulty estimation. The precise estimation of sentence-level readability is a hard problem and has recently attracted the attention of many researchers (Pilán, Volodina, & Johansson, 2014;Schumacher, Eskenazi, Frishkoff, & Collins-Thompson, 2016;Vajjala & Meurers, 2014). For efficiency, we use heuristic functions to make a rough estimation.…”
Section: The Coupled Bag-of-words Model (citation type: mentioning)
confidence: 99%
“…For efficiency, we use heuristic functions to make a rough estimation. Specifically, we consider the linguistic features designed for readability assessment that have been demonstrated to be effective in previous studies (Feng et al, 2010;Schumacher et al, 2016), and choose the most used linguistic features that can be operated at the sentence level to build the heuristic functions. In total, eight heuristic functions h 2 {len, ans, anc, lv, art, ntr, pth, anp} corresponding to eight distinct features from three aspects are used to compute the reading score of a sentence, as shown in Table 1.…”
Section: The Coupled Bag-of-words Model (citation type: mentioning)
confidence: 99%
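The citing work combines eight heuristic functions (len, ans, anc, lv, art, ntr, pth, anp) defined in its Table 1, which is not reproduced here. The sketch below only illustrates the general pattern of scoring a sentence with a set of cheap heuristics; the two features and the equal weighting are placeholder assumptions, not the authors' actual definitions.

```python
# Illustrative sketch of heuristic sentence-level readability scoring.
# The features and the equal weighting are assumptions for illustration only.
def sentence_length(sentence: str) -> float:
    """Number of whitespace-separated tokens."""
    return float(len(sentence.split()))

def avg_word_length(sentence: str) -> float:
    """Mean number of characters per token."""
    tokens = sentence.split()
    return sum(len(t) for t in tokens) / len(tokens)

HEURISTICS = [sentence_length, avg_word_length]

def reading_score(sentence: str) -> float:
    """Combine heuristic scores with equal weights (illustrative choice)."""
    return sum(h(sentence) for h in HEURISTICS) / len(HEURISTICS)

print(reading_score("The cat sat on the mat."))
print(reading_score("Notwithstanding prior precedent, the court demurred."))
```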