2011
DOI: 10.1002/j.2333-8504.2011.tb02266.x
The Relationship Between Raters' Prior Language Study and the Evaluation of Foreign Language Speech Samples

Abstract: This study investigated whether raters' second language (L2) background and the first language (L1) of test takers taking the TOEFL iBT® Speaking test were related in the scoring process. After an initial 4-hour training period, a group of 107 raters (mostly learners of Chinese, Korean, and Spanish) listened to a selection of 432 speech samples produced by 72 test takers (native speakers of Chinese, Korean, and Spanish). We analyzed the rating data using a multifaceted Rasch measurement approach to uncover pote…
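
For context on the analysis method named in the abstract, many-facet Rasch measurement is commonly written in the rating-scale form below. This is a standard textbook formulation rather than one reproduced from the report itself, and the symbols are illustrative:

\[
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]

Here \(P_{nijk}\) is the probability that test taker \(n\) receives a rating in category \(k\) from rater \(j\) on task \(i\); \(B_n\) is the test taker's proficiency, \(D_i\) the task's difficulty, \(C_j\) the rater's severity, and \(F_k\) the difficulty of category \(k\) relative to category \(k-1\). In a model of this kind, a systematic interaction between rater L2 background and test-taker L1 would appear as bias beyond these main effects.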

Cited by 18 publications (19 citation statements)
References 73 publications
“…This finding seems to support Hsieh's suggestion that the differences she found between professional and non-professional raters are due to differences in prior exposure to L2 speech rather than to their professional field (Hsieh, 2011, p. 63), thus supporting findings of earlier studies regarding familiarity with L2-accented speech (Derwing & Munro, 1997; Gass & Varonis, 1984; Rubin, 1992; Winke, Gass, & Myford, 2011). However, in our study both rater groups maintained abundant contact with L2 speakers, and therefore exposure to L2 speech does not seem to be a likely explanation.…”
Section: Research Questions (supporting)
confidence: 84%
“…Winke et al., 2011. The most informative and important piece of output from Facets analyses is the variable map, which summarizes the key information of each facet and grouping facet into one figure.…”
Section: Findings and Discussion (mentioning)
confidence: 99%
“…Yet, sometimes, even though the rubrics used are appropriate for the goals of the tests, raters may behave differently, both in their own scoring processes and from each other, while conducting the interviews, interacting with the test-takers and assessing the test-takers' performances. As a result, if raters are affected by construct-irrelevant factors during the rating process, it is highly possible that they will misjudge the performance of test-takers, which can lead to the misinterpretation of scores (Winke, Gass & Myford, 2011). In other words, rater measurement error, that is, "the variance in scores on a test that is not directly related to the purpose of the test" (Brown, 1996, p. 188), can result in a lower score than a test-taker really deserves, which in some cases may even lead to failing a test.…”
Section: Introduction (mentioning)
confidence: 99%
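
As a point of reference for the Brown (1996) definition quoted above, classical test theory decomposes an observed score into a true-score component and an error component. This is a textbook formulation added here for context, not taken from the cited study:

\[
X = T + E, \qquad \rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}
\]

where \(X\) is the observed score, \(T\) the true score, \(E\) the error term, and \(\rho_{XX'}\) the reliability, i.e., the proportion of observed-score variance that is not error. Construct-irrelevant rater variance of the kind discussed above inflates \(\sigma^2_E\) and therefore lowers reliability.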
“…Previous studies have investigated rater effects on oral test scores from different perspectives such as the raters' educational and professional experience (e.g., Chalhoub-Deville, 1995), raters' nationality and native language (e.g., Chalhoub-Deville & Wigglesworth, 2005; Winke & Gass, 2012; Winke et al., 2011), rater training (e.g., Lumley & McNamara, 1995; Myford & Wolfe, 2000), and the gender of candidates and/or interviewers (e.g., O'Loughlin, 2002; O'Sullivan, 2000). For instance, Lumley and McNamara (1995) examined the effect of rater training on the stability of rater characteristics and rater bias, whereas MacIntyre, Noels, and Clément (1997) examined bias in self-ratings in terms of participants' perceived competence in an L2 in relation to their actual competence and language anxiety.…”
Section: Introduction (mentioning)
confidence: 99%