2012
DOI: 10.1017/s1351324912000344

Using the crowd for readability prediction

Abstract: While human annotation is crucial for many natural language processing tasks, it is often very expensive and time-consuming. Inspired by previous work on crowdsourcing, we investigate the viability of using non-expert labels instead of gold standard annotations from experts for a machine learning approach to automatic readability prediction. In order to do so, we evaluate two different methodologies to assess the readability of a wide variety of text material: a more traditional set-up in which expert readers ma…
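A quick way to see what the abstract's comparison entails is a small end-to-end sketch: train a readability regressor on crowd-derived scores and check how well its predictions rank texts relative to expert judgments. Everything below (features, data, learner) is an illustrative assumption, not the paper's actual system.

```python
# Minimal sketch (not the paper's pipeline): learn readability from noisy
# crowd labels, then evaluate the ranking against "gold" expert labels.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Toy surface features per text: stand-ins for whatever a real readability
# system would extract (sentence length, word length, lexical richness, ...).
X = rng.uniform(0.0, 1.0, size=(100, 3))
difficulty = X @ np.array([0.6, 0.3, 0.1])        # latent difficulty
crowd = difficulty + rng.normal(0.0, 0.05, 100)    # noisy crowd labels
expert = difficulty + rng.normal(0.0, 0.02, 100)   # expert "gold" labels

train, test = np.arange(80), np.arange(80, 100)
model = Ridge().fit(X[train], crowd[train])        # trained on crowd labels only

# If crowd labels are a viable substitute, the model should still rank
# unseen texts much like the experts do.
rho, _ = spearmanr(model.predict(X[test]), expert[test])
print(f"Spearman rho against expert scores: {rho:.2f}")
```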

Citation types: 2 supporting, 37 mentioning, 0 contrasting

Year published: 2014–2023

Cited by 43 publications (39 citation statements)
References 31 publications
“…Our results are consistent with the findings of De Clercq et al. [17] for generic texts in Dutch. In their study the crowd was presented with pairwise comparisons.…”
Section: Discussion (supporting)
confidence: 94%
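For context, a pairwise set-up like the one quoted above yields a ranking only after the crowd's comparisons are aggregated. A minimal sketch using simple win-rate aggregation follows; the judgments and the aggregation scheme are assumptions for illustration (a more principled choice would be a Bradley–Terry-style model).

```python
from collections import defaultdict

# Hypothetical crowd judgments: (easier, harder) text-ID pairs, i.e. the
# crowd judged the first text of each pair easier to read than the second.
judgments = [("t1", "t2"), ("t1", "t3"), ("t2", "t3"),
             ("t2", "t3"), ("t1", "t2")]

wins = defaultdict(int)         # times a text was judged the easier one
appearances = defaultdict(int)  # times a text occurred in any comparison

for easier, harder in judgments:
    wins[easier] += 1
    appearances[easier] += 1
    appearances[harder] += 1

# Win rate as a crude readability score: higher means easier.
scores = {t: wins[t] / appearances[t] for t in appearances}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['t1', 't2', 't3']
```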
“…From this, an expert ranking was created, using the midpoint of each expert-assigned range. The correlation between the expert sentence ranking and the crowd ranking can be seen in Table 6, reinforcing the finding that crowdsourced judgments can provide an accurate ranking of difficulty (De Clercq et al., 2014).…”
Section: Review of Data (supporting)
confidence: 70%
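The correlation step described in this statement reduces to a rank correlation between expert midpoints and crowd positions. A minimal sketch with invented numbers (the actual ranges and Table 6 values are not in this excerpt):

```python
from scipy.stats import spearmanr

# Hypothetical expert-assigned difficulty ranges per sentence; following the
# statement above, the midpoint of each range serves as the expert score.
expert_ranges = [(10, 30), (40, 60), (20, 50), (70, 90)]
expert_scores = [(lo + hi) / 2 for lo, hi in expert_ranges]  # 20, 50, 35, 80

# Hypothetical crowd-derived rank positions for the same sentences
# (1 = easiest).
crowd_ranks = [1, 3, 2, 4]

rho, p = spearmanr(expert_scores, crowd_ranks)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```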
“…We compared our model to the entity graph and to the entity grid (Barzilay and Lapata, 2008) and showed that normalization improves the results significantly in most tasks. Future work will include adding more linguistic information, stronger weighting schemes, and application to other readability datasets (Pitler and Nenkova, 2008; De Clercq et al., 2014).…”
Section: Results (mentioning)
confidence: 99%
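For readers unfamiliar with the entity grid (Barzilay and Lapata, 2008): a text is represented as a sentence-by-entity matrix, and coherence features are probabilities of entity transitions between adjacent sentences, typically normalized by the number of transitions so texts of different lengths are comparable. The sketch below is heavily simplified (binary presence instead of parsed S/O/X roles, entities given directly), and the normalization shown is the standard length normalization, not necessarily the scheme the citing paper evaluates.

```python
from collections import Counter

# Much-simplified entity grid: one row per sentence, one column per entity,
# 'X' if the entity occurs in the sentence, '-' otherwise. Real grids use
# syntactic roles (S/O/X) from a parser plus coreference resolution.
sentences = [
    {"crowd", "readability"},
    {"readability", "expert"},
    {"expert", "annotation"},
]
entities = sorted(set().union(*sentences))
grid = [["X" if e in sent else "-" for e in entities] for sent in sentences]

# Transition counts over adjacent sentence pairs, normalized by the total
# number of transitions to make texts of different lengths comparable.
transitions = Counter()
for row_a, row_b in zip(grid, grid[1:]):
    for a, b in zip(row_a, row_b):
        transitions[(a, b)] += 1

total = sum(transitions.values())
features = {t: n / total for t, n in transitions.items()}
print(features)  # {('-', '-'): 0.25, ('X', '-'): 0.25, ...}
```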