Coh-Metrix is among the broadest and most sophisticated automated textual assessment tools available today. Automated Evaluation of Text and Discourse with Coh-Metrix describes this computational tool, as well as the wide range of language and discourse measures it provides. Part I of the book focuses on the theoretical perspectives that led to the development of Coh-Metrix, its measures, and empirical work that has been conducted using this approach. Part II shifts to the practical arena, describing how to use Coh-Metrix and how to analyze, interpret, and describe results. Coh-Metrix opens the door to a new paradigm of research that coordinates studies of language, corpus analysis, computational linguistics, education, and cognitive science. This tool empowers anyone with an interest in text to pursue a wide array of previously unanswerable research questions.
In this study, a corpus of expert-graded essays, scored against a standardized rubric, is computationally evaluated to identify the differences between essays rated as high and those rated as low. The automated tool Coh-Metrix is used to examine the degree to which high- and low-proficiency essays can be predicted by linguistic indices of cohesion (i.e., coreference and connectives), syntactic complexity (e.g., number of words before the main verb, sentence structure overlap), the diversity of words used by the writer, and characteristics of words (e.g., frequency, concreteness, imageability). The three most predictive indices of essay quality in this study were syntactic complexity (as measured by number of words before the main verb), lexical diversity (as measured by the Measure of Textual Lexical Diversity), and word frequency (as measured by CELEX, logarithm for all words). Of the 26 validated indices of cohesion in Coh-Metrix, none showed differences between high- and low-proficiency essays, and no indices of cohesion correlated with essay ratings. These results indicate that the textual features that characterize good student writing are not aligned with those features that facilitate reading comprehension. Rather, essays judged to be of higher quality were more likely to contain linguistic features associated with text difficulty and sophisticated language.
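The Measure of Textual Lexical Diversity (MTLD) named in the abstract above has a simple published definition: the text is scanned token by token, and a "factor" is counted each time the running type-token ratio (TTR) falls to a threshold (conventionally 0.72); the score is tokens divided by factors, averaged over a forward and a backward pass. The sketch below illustrates that procedure under those conventional assumptions; it is not the Coh-Metrix implementation, and real use would require proper tokenization.

```python
def mtld_pass(tokens, threshold=0.72):
    """One directional MTLD pass: count the factors, i.e. segments
    whose running type-token ratio has fallen to the threshold."""
    factors = 0.0
    types, count, ttr = set(), 0, 1.0
    for tok in tokens:
        count += 1
        types.add(tok)
        ttr = len(types) / count
        if ttr <= threshold:          # segment exhausted: one full factor
            factors += 1
            types, count, ttr = set(), 0, 1.0
    if count > 0:                     # credit the leftover partial segment
        factors += (1 - ttr) / (1 - threshold)
    return len(tokens) / factors if factors > 0 else 0.0


def mtld(tokens, threshold=0.72):
    """MTLD averages a forward and a backward pass over the tokens."""
    return (mtld_pass(tokens, threshold)
            + mtld_pass(tokens[::-1], threshold)) / 2
```

For example, a maximally repetitive text such as `["a"] * 100` exhausts a factor every two tokens and scores 2.0, while more varied word choice stretches each segment and raises the score.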
The opinions of second language (L2) learning theorists and researchers are divided over whether to use authentic or simplified reading texts as the means of input for beginning- and intermediate-level L2 learners. Advocates of both approaches cite linguistic features, syntax, and discourse structures as important elements in support of their arguments, but there has been no conclusive study that measures these differences and their implications for L2 learning. The purpose of this article is to provide an exploratory study that fills this gap. Using the computational tool Coh-Metrix, this study investigates the differences between the linguistic structures of sampled simplified texts and those of authentic reading texts in order to provide a better understanding of the linguistic features that characterize these text types. The findings demonstrate that these texts differ significantly, but not always in the manner supposed by the authors of relevant scholarship. This research is meant to enable material developers, publishers, and classroom teachers to judge more accurately the value of both authentic and simplified texts.