2008
DOI: 10.1177/0265532207086780
Rater types in writing performance assessments: A classification approach to rater variability

Abstract: Research on rater effects in language performance assessments has provided ample evidence for a considerable degree of variability among raters. Building on this research, I advance the hypothesis that experienced raters fall into types or classes that are clearly distinguishable from one another with respect to the importance they attach to scoring criteria. To examine the rater type hypothesis, I asked 64 raters actively involved in scoring examinee writing performance on a large-scale assessment instrument …

Cited by 207 publications (151 citation statements)
References 39 publications
“…One analysis that might shed light on some of the differences across topics would be a many-faceted Rasch analysis using the FACETS software (Linacre, 2010; see also Myford & Wolfe, 2003, 2004, for details of this method of analysis), which can be used to estimate rater severity and task difficulty on the same linear scale, allowing investigation of questions such as whether specific raters judge essays on certain topics more severely than others. This analysis could provide more detailed information about rater bias, and along with e-rater feature scores could complement recent research on the factors that influence rater behavior (e.g., Eckes, 2008).…”
Section: Implications and Future Directions
confidence: 81%
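The many-facet Rasch analysis referred to in the quotation above can be summarized by the model's core equation. As a sketch in one common parameterization (the notation below is illustrative and not necessarily that used in the cited studies):

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \beta_i - \alpha_j - \tau_k
```

Here $P_{nijk}$ is the probability that examinee $n$ receives rating category $k$ rather than $k-1$ on task $i$ from rater $j$; $\theta_n$ denotes examinee ability, $\beta_i$ task difficulty, $\alpha_j$ rater severity, and $\tau_k$ the threshold of category $k$. Because all of these parameters are estimated on the same logit scale, rater severity and task difficulty become directly comparable, which is what makes the rater-bias analyses described in the quotation possible.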
“…One possible explanation for this result may be found in the research finding that essay raters do not base their scores strictly on the wording of a specific scale (see Eckes, 2008, for a recent review of the literature on rater behavior). For example, Lumley (2002) noted that raters' judgments seem to be based on "some complex and indefinable feeling about the text, rather than the scale content" and that raters form "a uniquely complex impression independently of the scale wordings."…”
Section: Discussion
confidence: 99%
“…Raters also judge students' writing ability differently depending on their academic background and sex (Vann, Lorenz & Meyer, 1991) and on the training they have received (Weigle, 1994). Studies such as Cumming (1990), Eckes (2008), Esfandiari & Myford (2013), González & Roux (2013), Lim (2011), Shi (2001), Shi, Wan, & Wen (2003), and Wiseman (2012) describe how distinct rater backgrounds influence (or do not influence) raters' rating behavior, actual scores, and scoring procedures. Lim (2011), for instance, focused on experienced and inexperienced raters.…”
Section: Introduction
confidence: 99%
“…Numerous researchers have modeled rater characteristics (for writing skills, e.g., Weigle, 1998; Engelhard & Myford, 2003; Eckes, 2005, 2008; Schoonen, 2005; for speaking skills, e.g., Vidakovic & Galaczi, 2009). These studies indicate that raters may differ in how they interpret and apply rating criteria, in how severely or leniently they judge examinees' language performance, in how they understand and use rating scales, and in how consistently they rate students at different proficiency levels.…”
Section: Communicative Language Skills and IRT Models