Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications - EANL '08 2008
DOI: 10.3115/1631836.1631847

Real-time web text classification and analysis of reading difficulty

Abstract: The automatic analysis and categorization of web text has witnessed a booming interest due to the increased text availability of different formats, content, genre and authorship. We present a new tool that searches the web and performs in real-time a) html-free text extraction, b) classification for thematic content and c) evaluation of expected reading difficulty. This tool will be useful to adolescent and adult low-level reading students who face, among other challenges, a troubling lack of reading material …

Cited by 33 publications (20 citation statements)
References 9 publications
“…Future empirical research will determine the predictive validity of Read-X and its usability in educational settings. Meanwhile, developers are planning to examine whether psycholinguistic and discourse processing factors like syntactic complexity, propositional density, and rhetorical structure may improve the program's readability analysis (Miltsakaki and Troutt 2008).…”
Section: Recent Traditional-style Methods (mentioning)
confidence: 99%
“…A downside of the indirect approach, as can also be seen in the results of this study, is that the accuracy is not as good as in semi-manual systems. The method presented in this article can be used also, for example, in competence management (Honkela, Nordfors, & Tuuli, 2004; Volpentesta & Felicetti, 2011) or in a search engine for educational material for adolescents or younger pupils (Miltsakaki & Troutt, 2008). A broad scenario is to use the method to assess documents retrieved by a general-purpose search engine.…”
Section: Discussion (mentioning)
confidence: 98%
“…One approach to automate readability calculations is to select those formulas that do not require linguistic resources (e.g., Miltsakaki & Troutt, 2008; Petersen & Ostendorf, 2009; Crossley, Allen, & McNamara, 2011). Another possibility is to use extensive language-specific preprocessing or word lists (Feng, Elhadad, & Huenerfauth, 2009; François, 2009; Heilman, Collins-Thompson, & Eskenazi, 2008).…”
Section: Automatic Approaches For Text Difficulty Assessment (mentioning)
confidence: 99%
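
The statement above refers to readability formulas that need no linguistic resources. As a minimal sketch of what such a formula looks like, the Python below computes the LIX index, which uses only counts of words, sentences, and long words (more than six letters). Choosing LIX here is an illustrative assumption and is not confirmed by this page as the formula used in the cited tool. The function name `lix` and the simple regex-based tokenization are also hypothetical choices.

```python
import re


def lix(text: str, long_word_len: int = 7) -> float:
    """Return the LIX readability index of `text`.

    LIX = (words / sentences) + 100 * (long words / words),
    where a long word has more than six letters.
    Tokenization is deliberately naive (an assumption of this sketch):
    sentences end at '.', '!' or '?', and words are runs of letters.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    long_words = [w for w in words if len(w) >= long_word_len]
    return len(words) / len(sentences) + 100.0 * len(long_words) / len(words)


if __name__ == "__main__":
    print(lix("The cat sat. The dog ran."))
    print(lix("Readability matters."))
```

Higher values indicate harder text. Because the formula needs no dictionary, parser, or word list, it can be applied to freshly extracted web text without language-specific preprocessing.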
“…Every classifier operates well on different aspects of the training or test feature vector. As a result, assuming appropriate conditions, combining multiple classifiers may improve classification performance when compared with any single classifier (Miltsakaki and Troutt, 2009 and Wikipedia, 2013).…”
Section: Classifiers (mentioning)
confidence: 99%
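
One simple way to combine multiple classifiers, as the statement above describes, is majority voting. The sketch below is an illustration under stated assumptions, not the method of any cited paper. The three keyword-based classifiers and the labels `"science"` and `"sports"` are hypothetical stand-ins for trained models.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier here is any function mapping a text to a label (assumption).
Classifier = Callable[[str], str]


def majority_vote(classifiers: Sequence[Classifier], text: str) -> str:
    """Return the label predicted by the most classifiers.

    Ties go to the label that was voted for first.
    """
    if not classifiers:
        raise ValueError("need at least one classifier")
    votes = [clf(text) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy classifiers standing in for trained models.
def by_cell(text: str) -> str:
    return "science" if "cell" in text else "sports"


def by_goal(text: str) -> str:
    return "sports" if "goal" in text else "science"


def by_team(text: str) -> str:
    return "sports" if "team" in text else "science"


if __name__ == "__main__":
    ensemble = [by_cell, by_goal, by_team]
    print(majority_vote(ensemble, "the team scored a late goal"))
```

Each toy classifier looks at a different cue, so a single one can be wrong on a given text while the vote is still right. This is the intuition behind the quoted claim.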