2010
DOI: 10.1007/bf03216919

Using the method of pairwise comparison to obtain reliable teacher assessments

Abstract: Demands for accountability have seen the implementation of large scale testing programs in Australia and internationally. There is, however, a growing body of evidence to show that externally imposed testing programs do not have a sustained impact on student achievement. It has been argued that teacher assessment is more effective in raising student achievement levels. However, it is also often argued that teacher assessments are less reliable than the results of testing programs. This paper presents a study i…

citations
Cited by 79 publications
(76 citation statements)
references
References 22 publications
“…CJ has been used in a range of educational research and practice (e.g., Bramley, 2007; Bramley, Bell & Pollitt, 1998; Heldsinger & Humphry, 2010; Seery, Canty & Phelan, 2012). …”
Section: Comparative Judgement (CJ)
Citation type: mentioning
confidence: 99%
“…Indeed the development of CJ for educational assessment has involved a diversity of disciplines ranging from design and technology (Kimbell, 2012) to narrative writing (Heldsinger & Humphry, 2010). Therefore, CJ may offer the potential to enable the assessment of rich and authentic educational outcomes in a wide variety of subject areas and contexts.…”
Section: Final Remarks
Citation type: mentioning
confidence: 99%
“…Pollitt posited that in Kimbell et al's (2009) study the high reliability coefficient of 0.96 generated by 28 judges assessing 352 e-portfolios with 3067 judgements was higher than any analytical marking system could achieve. Studies by Heldsinger and Humphry (2010), (2013) found that highly reliable comparative judgements of writing scripts by a large number of judges were made using calibrated exemplars as referents, as suggested by Thurstone (1928). They argued that comparative judgement, using calibrated exemplars, were successfully used as a highly reliable method to validate results from large-scale testing programs without the extensive assessor training and moderation processes that were required for absolute analytical judgements.…”
Section: Relative and Absolute Judgements
Citation type: mentioning
confidence: 99%
“…Several researchers have found that judgements based on quality, using specified holistic criteria, resulted in a more valid assessment of performance (Bramley, 2007; Heldsinger & Humphry, 2010; Pollitt, 2012a). According to Pollitt (2012a) the high reliability of comparative judgement is a "consequence of the constant focus on validity" (p. 168) and this "demands and checks that a sufficient consensus exists amongst the pool of judges involved" (p. 167).…”
Section: Relative and Absolute Judgements
Citation type: mentioning
confidence: 99%