1995
DOI: 10.1177/026553229501200206

Investigating variability in tasks and rater judgements in a performance test of foreign language speaking

Abstract: Much of the recent debate that has surrounded the development and use of 'performance', or 'communicative' language tests has focused on a supposed trade-off between two sets of desirable qualities: correspondence between test tasks and test performance to nontest language use for content relevance; and reliability of scores derived from test performance. One area that has been of particular concern with performance tests is the potential variability in tasks and rater judgements, and this has been investigate…

citations
Cited by 147 publications
(98 citation statements)
references
References 11 publications
“…The two univariate G-study designs showed that the largest variance component was in the test-takers themselves, indicating that most variability in ESL oral performance can be explained by test-takers' true ability, not by construct-irrelevant factors such as tasks or raters. The contribution of the interaction between test-takers and raters to the total score variance estimate was found to be substantial (approximately 5% to 6%), confirming the findings of Bachman, Lynch, and Mason (1995) and Lynch and McNamara (1998). This suggests that ESL teachers exhibited somewhat different severity patterns to a certain group of ESL learners.…”
Section: Discussion (supporting)
confidence: 72%
“…Therefore, this study adopts the triangulation approach employed in other studies (e.g., Bachman, Lynch, & Mason, 1995; Lynch & McNamara, 1998) of combining many-facet Rasch measurement with generalizability theory, while adding consideration of interrater score correlations as an additional source of information on scoring consistency.…”
Section: Results (mentioning)
confidence: 99%
“…It was thus apparent that Skehan and his colleagues' (Foster and Skehan 1996; Skehan and Foster 1997) observation that more interactive tasks lead to more complex language performance did not find support in the Bygate and Michel et al. (2007) studies. In language testing contexts, a few studies (e.g., Fulcher 1996; Bachman et al. 1995) reported significant but small differences in test scores across different types of test tasks. More recently, a number of studies conducted in experimental language testing settings that replicated Skehan's or Robinson's framework concerning the impact of task performance conditions on task performance revealed results that did not lend much support to either of their theoretical frameworks.…”
Section: HKEAA's Statistical Moderation (mentioning)
confidence: 99%