2013
DOI: 10.1002/sim.5848

Correcting for rater bias in scores on a continuous scale, with application to breast density

Abstract: Existing literature on inter-rater reliability focuses on quantifying the disagreement between raters. In this paper, we introduce a method to correct for inter-rater disagreement (or observer bias), where raters are assigning scores on a continuous scale. To do this, we propose a two-stage approach. In the first stage, we standardise the distributions of rater scores to account for each rater's subjective interpretation of the continuous scale. In the second stage, we correct for case-mix differences between …
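To make the first stage concrete, here is a minimal sketch of one way to standardise scores within each rater. It assumes a plain per-rater z-score, which may differ from the transformation the paper actually uses; the function name `standardise_by_rater`, the tuple layout, and the example scores are all hypothetical. The second stage (case-mix correction) is not sketched because the abstract is truncated before describing it.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardise_by_rater(scores):
    """Z-score each rater's scores against that rater's own distribution.

    `scores` is a list of (rater_id, case_id, score) tuples.
    Assumption: a simple z-score stands in for the paper's standardisation.
    """
    # Collect every score given by each rater.
    by_rater = defaultdict(list)
    for rater, _case, score in scores:
        by_rater[rater].append(score)

    # Per-rater mean and (population) standard deviation.
    stats = {r: (mean(v), pstdev(v)) for r, v in by_rater.items()}

    standardised = []
    for rater, case, score in scores:
        mu, sd = stats[rater]
        # A rater who gives every case the same score has no spread to rescale.
        z = 0.0 if sd == 0 else (score - mu) / sd
        standardised.append((rater, case, z))
    return standardised


# Hypothetical data: rater B scores every case 30 points higher than rater A.
example = [
    ("A", 1, 10.0), ("A", 2, 20.0), ("A", 3, 30.0),
    ("B", 1, 40.0), ("B", 2, 50.0), ("B", 3, 60.0),
]
result = standardise_by_rater(example)
```

In this toy input the two raters differ only by a constant offset, so their standardised scores for each case coincide. Real raters would typically see different case mixes, which is the problem the second stage is described as addressing.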

Cited by 12 publications (12 citation statements)
References 39 publications

“…Observer measurement of breast density has been shown in other studies to suffer from interobserver variability [123,124,131,132] and a recent paper has attempted to provide a correction for differences between observers. [133] From our results it is clear that observers are able to discriminate different densities in subjects with low automated breast densities. However, when examining the histograms, shown in Figure 14, of the area-based measurements from both human observers and software analysis, there is clearly a difference in the distribution of scores.…”
Section: Breast Density and Cancer Risk (mentioning)
confidence: 57%
“…More follow-up is needed to help to address this issue. Thirdly, the visually assessed score required human judgement, which might make it unreliable for routine use in a screening program [31], although the same applies to BI-RADS density. Exploration of automated methods is ongoing in a subset of the cohort.…”
Section: Discussion (mentioning)
confidence: 99%
“…VAS has been used in large clinical studies [30,31,32,33,34] and is considered preferable to some of the thresholding-based methods (described below) as it is less laborious and does not require specific reader training.…”
Section: Visual Methods (mentioning)
confidence: 99%