2000
DOI: 10.1002/j.2333-8504.2000.tb01829.x
Monitoring Sources of Variability Within the Test of Spoken English Assessment System

Abstract: The purposes of this study were to examine four sources of variability within the Test of Spoken English (TSE®) assessment system, to quantify ranges of variability for each source, to determine the extent to which these sources affect examinee performance, and to highlight aspects of the assessment system that might suggest a need for change. Data obtained from the February and April 1997 TSE scoring sessions were analyzed using Facets (Linacre, 1999a). The analysis showed that, for each of the two TSE admini…

Cited by 35 publications (38 citation statements) | References 16 publications
“…In conclusion, the findings of the present study concur with previous studies in confirming that raters may be affected by factors other than the actual performance of the test-takers (e.g., Chalhoub-Deville, 1995; Chalhoub-Deville & Wigglesworth, 2005; Lumley & McNamara, 1995; Myford & Wolfe, 2000; Winke & Gass, 2012). As in those studies, measurement error, whether random or systematic, was observed in this study, underlining the factors that may cause disagreement within and/or among raters' judgments in oral performance assessments.…”
Section: Discussion (supporting)
confidence: 91%
“…In other words, 75% of the Total Scores assigned by these 15 raters ranked lower or higher in the post-test, with differences ranging from one point to more than 10 points. As discussed by Myford and Wolfe (2000), one point may not seem like a large difference, but it can have an important effect for test takers whose scores fall near the borderline/pass score. Figure 2 below presents the results on the raters' behavior in terms of (a) whether there was a statistically significant difference between their pre- and post-test scores, and (b) whether they referred to the proficiency levels of the students in their think-aloud protocols.…”
Section: Results (mentioning)
confidence: 99%
“…The extension of the Rasch model for analyses of assessor-mediated ratings is called the many-faceted Rasch model (FACETS model, Linacre, 1989). The FACETS model has been used to examine the psychometric quality of a variety of performance assessments based on assessor-mediated ratings (e.g., Engelhard, 1992, 1994, 1996; Heller, Sheingold, & Myford, 1998; Linacre, Engelhard, Tatum, & Myford, 1994; Lunz & Stahl, 1990; Lunz, Wright, & Linacre, 1990; Myford, Marr, & Linacre, 1996; Myford & Mislevy, 1995; Myford & Wolfe, 2000a; Paulukonis, Myford, & Heller, 2000; Wolfe, Chiu, & Myford, 1999). It should be stressed that the FACETS model provides additional information that supplements, rather than supplants, the inferences provided by more traditional methods that have been used previously to analyze the NBPTS assessments.…”
mentioning
confidence: 99%
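For context, the many-faceted Rasch model referred to in the statement above is commonly written (in a standard Linacre-style formulation; the specific notation here is illustrative, not taken from the excerpt) as:

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_n - \delta_i - \alpha_j - \tau_k
```

where \(P_{nijk}\) is the probability that examinee \(n\) receives a rating in category \(k\) on item \(i\) from rater \(j\), \(\theta_n\) is examinee ability, \(\delta_i\) is item difficulty, \(\alpha_j\) is rater severity, and \(\tau_k\) is the threshold between categories \(k-1\) and \(k\). The rater-severity facet \(\alpha_j\) is what lets Facets-style analyses separate differences in rater harshness from differences in examinee performance.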