1981
DOI: 10.2466/pms.1981.52.1.315
Reminder: Reliability of Global Judgments

Abstract: Reliability as measured by the extent of agreement is often a problem for complex global judgments. Empirically, the use of multiple raters improved reliability consistent with predictions from the Spearman-Brown formula. Implications for the reliability of clinical diagnosis are suggested.
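The Spearman-Brown prophecy formula referenced in the abstract predicts the reliability of the mean of k parallel raters from the single-rater reliability. A minimal sketch of that prediction (the function name and the example reliabilities are illustrative, not taken from the paper):

```python
def spearman_brown(r1: float, k: int) -> float:
    """Predicted reliability of the mean of k parallel raters,
    given single-rater reliability r1 (Spearman-Brown prophecy formula)."""
    return k * r1 / (1 + (k - 1) * r1)

# A modestly reliable single rater (r = .50) improves as raters are added:
print(round(spearman_brown(0.50, 1), 2))   # 0.5
print(round(spearman_brown(0.50, 4), 2))   # 0.8
print(round(spearman_brown(0.50, 10), 2))  # 0.91
```

This is the mechanism behind the abstract's claim: pooling multiple raters raises the reliability of a complex global judgment even when each individual rater agrees only moderately with the others.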

Cited by 24 publications (17 citation statements); references 7 publications.
“…Analysis of variance was used to estimate the reliability of the multiple rater technique. The mean of the raters' judgments for each child was highly reliable, R = .98, F (5, 420) = 57.95, p < .01, which conforms to requirements for the use of multiple judges (Aiken, 1985; Seiz, 1982; Roff, 1981; Winer, 1971). A test for order of task presentation to raters showed no reliable effect, t (4) < 1.…”
Section: Results (supporting)
confidence: 63%
“…The statistical safety inherent in large numbers is perhaps obvious, but the uniformity of the raters' judgments must be assessed to confirm the validity of this approach in determining instructional efficacy (cf. Aiken, 1985; Roff, 1981; Seiz, 1982). It may be argued that untrained raters are less subject to bias when judging the performance of simple skills; in any event our purpose was to identify obvious differences between training strategies that all observers could agree upon.…”
Section: Instructions To Raters (mentioning)
confidence: 99%
“…We created for this study four empirically derived symptom scales from the symptom section of the coding book. We chose all those condensed symptom ratings that had both ≥.70 Spearman-Brown interrater reliability (cf. Roff, 1981) and >10% prevalence in the sample. We computed a principal components analysis with rotation to varimax on the 19 symptoms that reached these criteria.…”
Section: Methods (mentioning)
confidence: 78%
“…Reliability coefficients were calculated on the preconsensus, independent ratings of all the coding variables and scales. They were then transformed by the Spearman-Brown prophecy formula to provide the estimated reliabilities for the consensed ratings (cf. Roff, 1981). For all the variables discussed in this paper, the reliabilities of the consensed ratings ranged from 0.42 to 0.99, with a mean of 0.89.…”
Section: Methods (mentioning)
confidence: 85%
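The Methods excerpt above describes scaling up single-rater reliabilities to estimate the reliability of consensus ratings. A hedged sketch of that transformation, assuming two raters per consensus rating (the input reliabilities below are hypothetical, not values from the cited study):

```python
def spearman_brown(r1: float, k: int = 2) -> float:
    """Spearman-Brown prophecy formula: predicted reliability
    of a rating pooled over k raters, from single-rater reliability r1."""
    return k * r1 / (1 + (k - 1) * r1)

# Hypothetical preconsensus (single-rater) reliabilities for three variables:
preconsensus = [0.55, 0.70, 0.85]
consensus = [round(spearman_brown(r), 2) for r in preconsensus]
print(consensus)  # [0.71, 0.82, 0.92]
```

Each two-rater consensus estimate exceeds its single-rater input, which is why the studies quoted here routinely average dual ratings before analysis.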
“…The interrater reliability was .79 (n = 177) for Total PCL-R scores; .72 (n = 148) for Facet 1; .59 (n = 169) for Facet 2; .57 (n = 177) for Facet 3; and .78 (n = 179) for Facet 4. To increase reliability, all dual-rated PCL-R scores were averaged across raters (Epstein, 1980; Roff, 1981).…”
Section: Study (mentioning)
confidence: 99%