This article documents 3 coordinated, exploratory studies that empirically developed a framework to describe the decisions experienced writing assessors make when evaluating ESL/EFL written compositions. The studies are part of ongoing research to prepare a new scoring scheme and tasks for the writing component of the Test of English as a Foreign Language (TOEFL). In Study 1, a research team of 10 experienced ESL/EFL raters developed a preliminary descriptive framework from their own think-aloud protocols while each rated (without any predefined scoring criteria) 60 TOEFL essays at 6 different score points on 4 different essay topics. Study 2 applied the framework to verbal report data from 7 highly experienced English-mother-tongue (EMT) composition raters, each of whom rated 40 TOEFL essays. In Study 3, we refined the framework by analyzing think-aloud protocols from 7 of the same ESL/EFL raters as they rated compositions from 6 ESL students on 5 different writing tasks that involved writing in response to reading or listening material. In each study, participants completed a questionnaire profiling their individual characteristics and relevant background variables. In addition to documenting and analyzing these raters' thinking processes in detail, we found that both groups of raters used similar decision-making behaviors, in similar proportions, when assessing both the TOEFL essays and the new writing tasks, thus verifying the appropriateness of our descriptive framework. Raters attended more extensively to rhetoric and ideas (compared to language) in compositions they scored high than in compositions they scored low. Overall, however, the ESL/EFL raters attended more extensively to language than to rhetoric and ideas, whereas the EMT raters balanced their attention to these main features of the written compositions more evenly. Most participants perceived that their previous experience rating compositions and teaching English had influenced their criteria and processes for rating the compositions.
We assessed whether and how the discourse written for prototype integrated tasks (involving writing in response to print or audio source texts) field tested for the new TOEFL® differs from the discourse written for independent essays (i.e., the TOEFL Essay). We selected 216 compositions written for 6 tasks by 36 examinees in a field test, representing Score Levels 3, 4, and 5 on the TOEFL Essay, and then coded the texts for lexical and syntactic complexity, grammatical accuracy, argument structure, orientations to evidence, and verbatim uses of source text. Analyses with nonparametric MANOVAs, following a 3-by-3 (task type by English proficiency level) within-subjects factorial design, showed that the discourse produced for the integrated writing tasks differed significantly from the discourse produced for the independent essay at the lexical, syntactic, rhetorical, and pragmatic levels on most of these variables. In certain analyses, these differences were also obtained across the 3 ESL proficiency levels.
Writing tasks assigned in 162 undergraduate and graduate courses in several disciplines at 8 universities were collected. From a sample of the assignments, key dimensions of difference were identified, and a classification scheme based on those dimensions was developed. Application of the classification scheme provided data on the prevalence of various types of assignments and, for essay tasks, showed the degree to which the assignments were characterized by each of several features. Differences in the kinds of writing tasks assigned across different groups of disciplines were also examined.