2014
DOI: 10.1177/0962280214537392

Statistical issues in the comparison of quantitative imaging biomarker algorithms using pulmonary nodule volume as an example

Abstract: Quantitative imaging biomarkers (QIBs) are being used increasingly in medicine to diagnose and monitor patients' disease. The computer algorithms that measure QIBs have different technical performance characteristics. In this paper we illustrate the appropriate statistical methods for assessing and comparing the bias, precision, and agreement of computer algorithms. We use data from three studies of pulmonary nodules. The first study is a small phantom study used to illustrate metrics for assessing repeatabili…
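The abstract mentions metrics for assessing repeatability. Below is a minimal sketch of one such metric, assuming balanced replicate phantom measurements; the nodule volumes are invented for illustration, not the study's phantom data, and this is not claimed to be the paper's exact computation.

```python
# Sketch only: pooled within-subject SD and repeatability coefficient from
# replicate phantom measurements. All numbers are invented for illustration.
import numpy as np

# rows = phantom nodules, columns = repeated measurements by one algorithm (mm^3)
replicates = np.array([
    [512.0, 498.0, 505.0],
    [1020.0, 1011.0, 1033.0],
    [250.0, 261.0, 255.0],
])

# With the same number of replicates per nodule, the pooled within-subject
# variance is the mean of the per-nodule sample variances.
wSD = np.sqrt(replicates.var(axis=1, ddof=1).mean())

# Repeatability coefficient: 95% of differences between two replicate
# measurements on the same nodule are expected to lie within +/- RC.
RC = 1.96 * np.sqrt(2) * wSD  # approximately 2.77 * wSD

print(f"wSD = {wSD:.1f} mm^3, RC = {RC:.1f} mm^3")
```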

Cited by 57 publications (70 citation statements)
References 22 publications
“…However, when normality is assumed, the CCC equals the ICC [34,36,37]. Although we report the CCC here, this measure does suffer from notable deficiencies common to many of these correlation measures [39][40][41] in that they are very sensitive to sample heterogeneity and that they are aggregate measures (thus making it difficult to separate systematic bias from issues in precision or large random errors). Thus, it would not be valid to compare our CCC measures to those measured on a different set of nodules with a different range of volumes.…”
Section: Precision (mentioning)
confidence: 97%
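To make the heterogeneity point in this statement concrete, here is a small sketch, on simulated rather than study data, of Lin's CCC for two hypothetical algorithms: the same measurement errors and the same fixed bias yield a lower CCC when the range of true volumes is narrower. The moment estimator below is one standard formulation, not necessarily the computation used in the cited papers.

```python
# Sketch only: Lin's concordance correlation coefficient (CCC) on simulated
# volume measurements from two hypothetical algorithms.
import numpy as np

def ccc(x, y):
    """Lin's CCC (moment estimator, n in the denominators)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

rng = np.random.default_rng(0)
true_vol = rng.uniform(100, 2000, size=200)          # hypothetical volumes (mm^3)
alg_a = true_vol + rng.normal(0, 30, size=200)       # algorithm A: random error only
alg_b = true_vol + 50 + rng.normal(0, 30, size=200)  # algorithm B: same error + fixed bias

print(f"CCC over the full volume range: {ccc(alg_a, alg_b):.3f}")

# Restricting to a narrower range (less between-nodule heterogeneity) lowers
# the CCC even though the measurement errors and the bias are unchanged.
small = true_vol < 500
print(f"CCC on small nodules only:      {ccc(alg_a[small], alg_b[small]):.3f}")
```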
“…Table 2 summarizes the metrics used in this study. For more details, readers are referred to a series of papers published on statistical methods for quantitative imaging biomarkers [31–33, 41].…”
Section: Spatial Overlap (mentioning)
confidence: 99%
“…More detail is provided by Raunig et al. (4). Investigators often want to compare the technical performance of two or more competing imaging procedures to assess the typical performance of the procedures, to identify the best procedure, to test the noninferiority of a procedure relative to a standard procedure, or to identify procedures that provide similar measurements (6). Table 7 summarizes some common research questions asked in QIB procedure comparison studies and possible study designs used with each.…”
Section: Metric Comment (mentioning)
confidence: 99%
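As one concrete reading of the noninferiority question listed above, the sketch below compares the precision of a hypothetical new algorithm against a standard one using a bootstrap upper bound on the ratio of within-subject SDs; the simulated data, the one-sided 95% bound, and the 1.25 margin are all assumptions for illustration, not a design taken from the paper or the cited references.

```python
# Sketch only: noninferiority check on precision via the ratio of within-subject
# SDs (new / standard), with a bootstrap over nodules. Everything here is
# simulated and the margin is arbitrary.
import numpy as np

rng = np.random.default_rng(1)
n_nodules, n_reps = 40, 2
true_vol = rng.uniform(100, 2000, size=(n_nodules, 1))
standard = true_vol + rng.normal(0, 25, size=(n_nodules, n_reps))  # replicate reads
new_alg  = true_vol + rng.normal(0, 28, size=(n_nodules, n_reps))

def wsd(reps):
    """Pooled within-subject SD from a (nodules x replicates) array."""
    return np.sqrt(reps.var(axis=1, ddof=1).mean())

ratios = []
for _ in range(2000):                       # resample nodules with replacement
    idx = rng.integers(0, n_nodules, n_nodules)
    ratios.append(wsd(new_alg[idx]) / wsd(standard[idx]))
upper = np.percentile(ratios, 95)           # one-sided 95% upper bound

margin = 1.25                               # hypothetical: at most 25% worse precision
print(f"wSD ratio = {wsd(new_alg) / wsd(standard):.2f}, 95% upper bound = {upper:.2f}")
print("noninferior precision" if upper < margin else "noninferiority not demonstrated")
```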
“…Comparison of intraclass correlation coefficients estimated from groups of subjects sampled from different populations can be misleading because intraclass correlation coefficients are scaled relative to the subjects in the study sample; thus, comparisons based on different populations can be invalid (5,6). Within-subject coefficient of variation: the within-subject coefficient of variation is the standard deviation of the replicate measures (within-subject standard deviation) divided by the mean.…”
Section: Intraclass Correlation Coefficient (mentioning)
confidence: 99%
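A short sketch of the within-subject coefficient of variation defined in the statement above, on invented replicate volumes; pooling the per-subject CVs as a root mean square is one common convention and is an assumption here, not a prescription from the cited text.

```python
# Sketch only: within-subject coefficient of variation (wCV) = within-subject SD
# of the replicates divided by the mean, pooled across subjects. Data invented.
import numpy as np

# rows = subjects (nodules), columns = replicate volume measurements (mm^3)
reps = np.array([
    [210.0, 221.0],
    [540.0, 515.0],
    [1480.0, 1522.0],
])

per_subject_cv = reps.std(axis=1, ddof=1) / reps.mean(axis=1)
wcv = np.sqrt(np.mean(per_subject_cv ** 2))   # root-mean-square pooling across subjects
print(f"within-subject CV = {100 * wcv:.1f}%")
```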