This article presents a formula for weighted kappa in terms of the rater means, rater variances, and the rater covariance that is particularly helpful in emphasizing that weighted kappa is an absolute agreement measure, in the sense that it is sensitive to differences in the raters' marginal distributions. Specifically, rater mean differences will decrease the value of weighted kappa relative to the value of the intraclass correlation, which ignores mean differences. In addition, if the rater variances also differ, then the value of weighted kappa will be decreased relative to the value of the product-moment correlation. Equality constraints on the rater means and variances are given to illustrate the relationships between weighted kappa, the intraclass correlation, and the product-moment correlation. In addition, the expression for weighted kappa shows that it belongs to the Zegers-ten Berge family of chance-corrected association coefficients; more specifically, weighted kappa is equivalent to the chance-corrected identity coefficient.

If two raters assign the same targets to categories, the ratings can be arranged in a bivariate frequency table such as Table 1, where for concreteness three response categories have been assumed. When the categories are ordered along a continuum, it is desirable to give partial credit for near agreement. Because weighted kappa (Cohen, 1968) allows for differential weighting of disagreements, it is an attractive agreement statistic for ordered categories and preferable to Cohen's (1960) kappa, which distinguishes only between agreement and disagreement. Generally, the gravity of a disagreement is related to the number of categories by which the raters differ. One way to implement a weighting scheme that
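As a worked illustration of the moment expression described above, here is a minimal sketch in Python. It assumes quadratic (squared-distance) disagreement weights, for which weighted kappa equals 2s_xy / (s_x^2 + s_y^2 + (m_x - m_y)^2); the function names and the example data are illustrative, not the article's notation. The statistic is computed once from rater means, variances, and the covariance, and once from the usual table-based definition as a cross-check, alongside the product-moment correlation.

```python
import numpy as np

def quadratic_weighted_kappa(x, y):
    """Quadratically weighted kappa via rater moments:
    kappa_w = 2*s_xy / (s_x^2 + s_y^2 + (m_x - m_y)^2),
    with variances and covariance using the 1/n convention."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()           # population (1/n) variances
    cxy = ((x - mx) * (y - my)).mean()  # population covariance
    return 2.0 * cxy / (vx + vy + (mx - my) ** 2)

def quadratic_weighted_kappa_table(x, y):
    """Same statistic from the 1 - D_obs/D_exp definition on the
    bivariate frequency table, with squared-distance weights."""
    x, y = np.asarray(x), np.asarray(y)
    cats = np.union1d(x, y)
    n = len(x)
    p = np.zeros((len(cats), len(cats)))
    for a, b in zip(x, y):
        p[np.searchsorted(cats, a), np.searchsorted(cats, b)] += 1.0 / n
    w = (cats[:, None] - cats[None, :]) ** 2          # disagreement weights
    d_obs = (w * p).sum()                             # observed disagreement
    d_exp = (w * np.outer(p.sum(1), p.sum(0))).sum()  # chance disagreement
    return 1.0 - d_obs / d_exp

if __name__ == "__main__":
    rater1 = [1, 2, 2, 3, 3, 3, 1, 2]
    rater2 = [1, 2, 3, 3, 2, 3, 2, 2]
    print(quadratic_weighted_kappa(rater1, rater2))        # moment form
    print(quadratic_weighted_kappa_table(rater1, rater2))  # table form (same value)
    print(np.corrcoef(rater1, rater2)[0, 1])               # Pearson r (larger here)
```

Because the example raters differ slightly in both mean and variance, the two kappa computations agree with each other while falling below the product-moment correlation, which is exactly the ordering the abstract describes.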
Because of response disturbances such as guessing, cheating, or carelessness, item response models often can only approximate the "true" individual response probabilities. As a consequence, maximum-likelihood estimates of ability will be biased. Typically, the nature of the response disturbances and the extent to which they are present are unknown, and therefore accounting for them by altering the model is not possible. Even if the nature of the response disturbances were known, accounting for them by increasing model complexity could easily lead to sample size requirements for estimation purposes that would be difficult to achieve. An approach based on weighting the contributions of the item responses to the log-likelihood function has been suggested by Mislevy and Bock. This estimation approach has been shown to reduce the bias of ability estimates effectively in the presence of response disturbances. However, it is prone to producing infinite ability estimates for unexpected response patterns in which correct answers are sparse. An alternative robust estimator of ability is suggested that does not appear to produce infinite estimates. Limited simulation studies show that the two estimators are equivalent when evaluated in terms of mean squared error.
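The weighting idea can be sketched for the Rasch model. The following is a generic, hypothetical illustration in Python, not the exact estimators studied here: the Tukey-biweight form of the weights, the tuning constant, and all names are assumptions. Each item's contribution to the likelihood equation is multiplied by a weight that shrinks toward zero for items far from the provisional ability, so a surprising response, such as a careless error on a very easy item, has less influence on the estimate.

```python
import numpy as np

def weighted_ability(responses, difficulties, tuning=4.0, n_iter=50):
    """Ability estimate for the Rasch model in which each item's
    contribution to the estimating equation is down-weighted by a
    Tukey-biweight factor based on its distance from the current
    ability. With tuning=np.inf all weights equal 1, giving the
    ordinary maximum-likelihood estimate."""
    x = np.asarray(responses, float)
    b = np.asarray(difficulties, float)
    theta = 0.0
    for _ in range(n_iter):
        r = (theta - b) / tuning                          # scaled logit residuals
        w = np.where(np.abs(r) < 1, (1 - r**2)**2, 0.0)   # biweight weights
        p = 1.0 / (1.0 + np.exp(-(theta - b)))            # Rasch probabilities
        score = np.sum(w * (x - p))                       # weighted likelihood equation
        info = np.sum(w * p * (1 - p))                    # weighted information
        if info <= 0:
            break
        step = score / info
        theta += step
        if abs(step) < 1e-8:
            break
    return theta

# A careless error on the easiest item, otherwise a consistent pattern:
x = [0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
b = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
print(weighted_ability(x, b))                  # robust estimate (~0.7)
print(weighted_ability(x, b, tuning=np.inf))   # plain MLE, pulled down (~0.3)
```

The contrast between the two printed values shows, under these illustrative assumptions, how down-weighting an unexpected response reduces the downward bias caused by the disturbance.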
The rater agreement literature is complicated by the fact that it must accommodate at least two different properties of rating data: the number of raters (two versus more than two) and the rating scale level (nominal versus metric). While kappa statistics are most widely used for nominal scales, intraclass correlation coefficients have been preferred for metric scales. In this paper, we suggest a dispersion-weighted kappa framework for multiple raters that integrates several important agreement statistics by using familiar dispersion indices as weights for expressing disagreement. These weights are applied to the ratings that identify the cells of the traditional inter-judge contingency table. Novel agreement statistics can be obtained by applying less familiar dispersion indices in the same way.
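To make the construction concrete, here is a minimal sketch in Python. The function name, the example data, and the pooled-pair form of the chance term are illustrative assumptions, not the paper's exact definitions. Disagreement for each target is measured by a dispersion index applied to all pairs of that target's ratings, averaged over targets, and then chance-corrected against the dispersion expected when ratings are paired irrespective of target. A squared-difference index yields an intraclass-correlation-type statistic, while a 0/1 mismatch index yields a Fleiss-type nominal agreement index.

```python
import numpy as np
from itertools import combinations

def dispersion_weighted_kappa(ratings, dispersion):
    """Chance-corrected agreement of the form 1 - D_obs / D_exp.

    ratings: (n_targets, n_raters) array.
    dispersion: function of two ratings returning their disagreement
        (e.g. squared difference for metric scales, 0/1 mismatch for
        nominal scales).

    D_obs averages the dispersion over all rater pairs within a target;
    D_exp averages it over all pairs of ratings pooled across targets,
    i.e. as if ratings were paired at random (a without-replacement
    variant of the usual marginal-based chance term).
    """
    r = np.asarray(ratings, float)
    n, m = r.shape
    d_obs = np.mean([dispersion(r[t, i], r[t, j])
                     for t in range(n)
                     for i, j in combinations(range(m), 2)])
    pooled = r.ravel()
    d_exp = np.mean([dispersion(a, b) for a, b in combinations(pooled, 2)])
    return 1.0 - d_obs / d_exp

ratings = np.array([[1, 1, 2],
                    [2, 2, 2],
                    [3, 2, 3],
                    [1, 2, 1],
                    [3, 3, 3]])

# squared-difference dispersion -> an ICC-like agreement index
print(dispersion_weighted_kappa(ratings, lambda a, b: (a - b) ** 2))
# 0/1 mismatch dispersion -> a Fleiss-type nominal agreement index
print(dispersion_weighted_kappa(ratings, lambda a, b: float(a != b)))
```

Swapping in other dispersion indices in the same slot is how, under this sketch, less familiar agreement statistics would be generated.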