Monitoring Faculty Consultant Performance in the Advanced Placement English Literature and Composition Program With a Many‐faceted Rasch Model

Engelhard, George; Myford, Carol M.

doi:10.1002/j.2333-8504.2003.tb01893.x

Cited by 53 publications

(61 citation statements)

References 48 publications


“…It appears that the raters overall were more lenient toward test takers who had Spanish as an L1, and more severe toward test takers who had Korean or Chinese as an L1. However, it should be noted that other researchers posit that differences between subgroup performances of less than .30 logits are usually not substantively meaningful (Engelhard & Myford, 2003), thus suggesting that the differences found here between the Spanish (on the one hand) and Korean and Chinese (on the other hand) L1…”

Section: Figure 2 Variable Map From the Facets Analysis of the Data (contrasting)

confidence: 55%

The Relationship Between Raters' Prior Language Study and the Evaluation of Foreign Language Speech Samples

Winke, Gass, Myford (2011). ETS Research Report Series


This study investigated whether raters' second language (L2) background and the first language (L1) of test takers taking the TOEFL iBT® Speaking test were related through scoring. After an initial 4‐hour training period, a group of 107 raters (mostly learners of Chinese, Korean, and Spanish) listened to a selection of 432 speech samples that 72 test takers (native speakers of Chinese, Korean, and Spanish) produced. We analyzed the rating data using a multifaceted Rasch measurement approach to uncover potential biases in the rating process. In addition, 26 of the raters participated in stimulated recall sessions, during which they watched videos of themselves rating. Using the video as a prompt, we asked them to discuss and explain their rating processes at the time of rating. The results from our bias interaction analyses revealed that matches between the raters' L2 and the test takers' L1 resulted in some of the raters assigning ratings that were significantly higher than expected. As a whole, raters with Spanish as an L2 were significantly more lenient toward test takers who had Spanish as an L1, and raters with Chinese as an L2 were significantly more lenient toward test takers who had Chinese as an L1. Analyses of the qualitative data, assisted by the program QSR NVivo 8, revealed information concerning the raters' awareness of their biases.


“…Besides, studies comparing methods used for determining inter-rater reliability based on different theories of measuring have an important place in recent research. These include studies comparing the methods based on the classical test theory, G theory, the many-facet Rasch measurement and the hierarchical rating model (Akın & Baştürk, 2010, 2012; Engelhard, 1994; Engelhard & Myford, 2003; Güler & Gelbal, 2010b; Güler & Teker, 2015; Iramaneerart, Myford, Yudkowsky, & Lowenstein, 2009; Iramaneerat et al, 2008; Linacre et al, 1990; Lynch & McNamara, 1998; Macmillan, 2000; Nakamura, 2000; Stenlund, 2013; Sudweeks, Reeve, & Bradshaw, 2004). Further details are not provided in relation to the above mentioned studies since the present study aims at comparing rubrics and graded-category rating scales used in scoring rather than comparing the methods used to determine inter-rater reliability.…”

Section: Rubrics (mentioning)

confidence: 99%

A Comparison of Rubrics and Graded Category Rating Scales with Various Methods Regarding Raters’ Reliability

Doğan, Uluman (2017). EDUC SCI-THEOR PRACT


“…Numerous researchers (for writing skills, e.g., Weigle, 1998; Engelhard & Myford, 2003; Eckes, 2005, 2008; Schoonen, 2005; for speaking skills, e.g., Vidakovic & Galaczi, 2009) have modeled rater characteristics. According to these studies, raters may differ in how they interpret and apply the rating criteria, in how severely or leniently they judge test takers' language performance, in how they understand and use the rating scales, and in how consistently they rate students of different skill levels.…”

Section: Communicative Language Skills and IRT Models (unclassified)

“…This model fit can be quantified with the outfit parameters (Engelhard & Myford, 2003; Park, 2004). The outfit parameters characterize how well the data fit the model, that is, the agreement between the observed and expected data at each rating score point. Ideally, this value is close to 1.0.…”

Section: Figure: The First Task, Vocabulary and Expression, Thurstonian Threshold (unclassified)

“…Ideally, this value is close to 1.0. A value greater than 2.0 may indicate that the descriptor for that score point is not functioning properly, and that raters use the score point inconsistently for test takers of different ability levels (Engelhard & Myford, 2003). Apart from the formal features and tone criterion, in both tasks we found outfit parameters above 2.0 at score point 5 for the grammar and spelling criterion.…”

Section: Figure: The First Task, Vocabulary and Expression, Thurstonian Threshold (unclassified)


Az idegen nyelvi érettségi működése és hatása a tanulói teljesítmények és a tanári nézetek tükrében

Vígh

