1996
DOI: 10.1007/bf02249609

Interrater reliability and agreement of performance ratings: A methodological comparison

Abstract: This paper demonstrates and compares methods for estimating the interrater reliability and interrater agreement of performance ratings. These methods can be used by applied researchers to investigate the quality of ratings gathered, for example, as criteria for a validity study, or as performance measures for selection or promotional purposes. While estimates of interrater reliability are frequently used for these purposes, indices of interrater agreement appear to be rarely reported for performance r…

Year Published: 1998–2024

Cited by 37 publications (37 citation statements)
References 11 publications
“…Three researchers independently reviewed and classified the quotes. This check-coding (Miles & Huberman, 1994) enabled us to assess the interrater reliability (Fleenor et al., 1996). After a few revisions to the code list, the three researchers' initial classifications matched at just above the 70% level.…”
Section: Data Collection (mentioning)
confidence: 99%
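The check-coding step quoted above reduces to a pairwise percent-agreement calculation. A minimal sketch, assuming each coder assigns one category per quote (the coder labels and data below are hypothetical, not from the cited study):

```python
# Pairwise percent agreement among three coders (hypothetical data).
from itertools import combinations

def percent_agreement(a, b):
    """Share of items that two coders classified identically."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Each list is one coder's classification of the same six quotes.
coder1 = ["A", "B", "A", "C", "B", "A"]
coder2 = ["A", "B", "C", "C", "B", "A"]
coder3 = ["A", "A", "A", "C", "B", "A"]

scores = [percent_agreement(a, b)
          for a, b in combinations([coder1, coder2, coder3], 2)]
print(f"mean pairwise agreement: {sum(scores) / len(scores):.0%}")
```

A mean pairwise figure like this is what a threshold such as "just above the 70% level" would be checked against.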
“…The literature most frequently recommends two approaches to inter-rater reliability: consensus and consistency. While consensus (agreement) measures whether raters assign the same score, consistency provides a measure of correlation between the scores of raters (Fleenor, Fleenor, and Grossnickle 1996). Some studies use generalisability theory to compute measurement estimates.…”
Section: Reliability Of Rubrics (mentioning)
confidence: 99%
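The consensus/consistency distinction this quote draws can be made concrete with two raters scoring the same items. A minimal sketch, with hypothetical rubric scores and Pearson correlation standing in for the consistency measure (requires Python 3.10+ for statistics.correlation):

```python
# Consensus vs. consistency for two raters (hypothetical scores).
from statistics import correlation  # Python 3.10+

rater1 = [3, 4, 2, 5, 4, 3, 2, 5]
rater2 = [4, 5, 3, 5, 5, 4, 3, 5]  # mostly one point higher

# Consensus (agreement): how often the raters give the exact same score.
consensus = sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)

# Consistency: how strongly the two sets of scores co-vary.
consistency = correlation(rater1, rater2)

print(f"consensus (exact agreement): {consensus:.0%}")   # 25% -- low
print(f"consistency (Pearson r):     {consistency:.2f}")  # 0.94 -- high
```

The contrast shows why the two approaches can diverge: a rater with a systematic leniency bias lowers consensus sharply while leaving consistency nearly intact.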
“…Results of the categorization were sent to interviewees for their approval and comments. Furthermore, to increase the internal validity of the data categorization, three researchers independently categorized the interviews (Fleenor et al., 1996). The categorization results of the independent researchers were approximately 80% identical with those of the original categorization and thus deemed to be fairly valid (see Table 1).…”
Section: Variables and Assessment (mentioning)
confidence: 98%