Consistent, repeatable tumour volume measurements are key to accurately calculating tumour growth and successfully assessing therapeutic efficacy. Our previous work showed that a novel 3D and thermal imaging system for measuring subcutaneous rodent tumours (BioVolume) significantly reduced inter-operator variability across three in vivo efficacy studies. Here we continue to investigate this reduction in inter-operator variability across a much larger dataset. A dataset of 5,593 inter-operator repeats and 5,073 corresponding calliper repeats was obtained from tumour scans and measurements in 22 laboratories across 238 studies with 112 users, 23 animal strains and 98 unique cell lines. The inter-operator variability of the two measurement methods was analysed using coefficient of variation (CoV), intra-class correlation (ICC) analysis, and significance testing. The 3D and thermal imaging system produced a significantly lower median CoV of 0.173 compared to a median calliper CoV of 0.205 (P value = 5.2 × 10^-9). ICC analysis further confirmed the statistical significance of these values, allowing us to conclude that this novel 3D and thermal imaging system offers a significant reduction in inter-operator variability. This has the potential to improve reproducibility of in vivo studies across a wide range of animal strains and cell lines. The effects of using a device with large inter-operator variability at critical points in the study were also investigated. At randomisation, changing the operator performing calliper measurements resulted in a 59.4% probability that a rodent would be reassigned to a different group. For measurements carried out using the imaging system, the probability that changing the operator would result in a change of a rodent’s group was much lower at 29.2%. During studies where the tumour was expected to regress, substituting an operator mid-study resulted in a tumour volume increase of approximately 500 mm^3 when callipers were used for measurement. For the imaging device, substituting users did not affect the tumour regression trend, potentially removing the need for the same operator to be present for the entire study duration. The effect of swapping an operator mid-study on the drug efficacy metric AUC (Area Under the Curve) was also examined; no statistically significant difference in AUC was observed for either BioVolume or callipers (overlapping 95% confidence intervals).
Repeatable tumor measurements are key to accurately assessing tumor growth and treatment efficacy. A preliminary study that we conducted showed that a novel 3D and thermal imaging system (3D-TI) for measuring subcutaneous tumors in rodents significantly reduced interoperator variability across 3 in vivo efficacy studies. Here we further studied this reduction in interoperator variability across a much larger dataset. A dataset consisting of 6,532 paired 3D-TI and caliper interoperator measurements was obtained from tumor scans and measurements in 27 laboratories across 289 studies, 153 operators, over 20 mouse strains, and 100 cell lines. Interoperator variability in both measurement methods was analyzed using coefficient of variation (CV), intraclass correlation (ICC) analysis, and significance testing. The median 3D-TI CV was significantly lower than the median caliper CV. The effects of large interoperator variability at critical points in the study were also investigated. At stratified randomization, changing the operator performing caliper measurements resulted in a 59% probability that a mouse would be reassigned to a different group. The probability that this would occur when using 3D-TI was significantly lower at 29%. In studies in which a tumor was expected to regress, changing the operator during the study was associated with a tumor volume increase of approximately 500 mm³ when using calipers. This change did not occur when using 3D-TI. We conclude that 3D-TI significantly reduces interoperator variability as compared with calipers and can improve reproducibility of in vivo studies across a wide range of mouse strains and cell lines.
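As a rough illustration of the coefficient of variation comparison described above, the following Python sketch computes a per-tumor CV across operators and then the median CV for each method. It assumes CV is the sample standard deviation divided by the mean of the repeated volumes; the function names and all measurement values are hypothetical and are not taken from the study.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(volumes):
    """Sample standard deviation divided by the mean of repeated volumes."""
    m = mean(volumes)
    if m == 0:
        raise ValueError("mean volume is zero; CV is undefined")
    return stdev(volumes) / m


def median_cv(repeats):
    """Median of the per-tumor CVs; each item holds one tumor's repeats."""
    return median(coefficient_of_variation(v) for v in repeats)


# Hypothetical data (mm^3): each inner list is the same tumor measured
# by different operators on the same day.
caliper_repeats = [[100.0, 140.0], [200.0, 260.0], [320.0, 280.0]]
imaging_repeats = [[110.0, 120.0], [210.0, 230.0], [300.0, 290.0]]

caliper_median_cv = median_cv(caliper_repeats)
imaging_median_cv = median_cv(imaging_repeats)
```

A lower median CV means that different operators agree more closely on the same tumor. The abstracts additionally report ICC analysis and significance testing, which this sketch does not attempt.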
User measurement bias during subcutaneous tumor measurement is a source of variation in preclinical in vivo studies. We investigated whether this user variability could impact efficacy study outcomes, in the form of the false negative result rate when comparing treated and control groups. Two tumor measurement methods were compared; calipers which rely on manual measurement, and an automatic 3D and thermal imaging device. Tumor growth curve data were used to create an in silico efficacy study with control and treated groups. Before applying user variability, treatment group tumor volumes were statistically different to the control group. Utilizing data collected from 15 different users across 9 in vivo studies, user measurement variability was computed for both methods and simulation was used to investigate its impact on the in silico study outcome. User variability produced a false negative result in 3.5% to 19.5% of simulated studies when using calipers, depending on treatment efficacy. When using an imaging device with lower user variability this was reduced to 0.0% to 2.4%, demonstrating that user variability impacts study outcomes and the ability to detect treatment effect. Reducing variability in efficacy studies can increase confidence in efficacy study outcomes without altering group sizes. By using a measurement device with lower user variability, the chance of missing a therapeutic effect can be reduced and time and resources spent pursuing false results could be saved. This improvement in data quality is of particular interest in discovery and dosing studies, where being able to detect small differences between groups is crucial.
User measurement bias during subcutaneous tumor measurement is a source of variation in preclinical in vivo studies. We investigated whether this user variability could impact efficacy study outcomes, in the form of the false negative result rate when comparing treated and control groups. Two tumor measurement methods were compared; calipers which rely on manual measurement, and an automatic 3D and thermal imaging device. Tumor growth curve data were used to create an in silico efficacy study with control and treated groups. Before applying user variability, treatment group tumor volumes were statistically different to the control group. Utilizing data collected from 15 different users across 9 in vivo studies, user measurement variability was computed for both methods and simulation was used to investigate its impact on the in silico study outcome. User variability produced a false negative result in 0.7% to 18.5% of simulated studies when using calipers, depending on treatment efficacy. When using an imaging device with lower user variability this was reduced to 0.0% to 2.6%, demonstrating that user variability impacts study outcomes and the ability to detect treatment effect. Reducing variability in efficacy studies can increase confidence in efficacy study outcomes without altering group sizes. By using a measurement device with lower user variability, the chance of missing a therapeutic effect can be reduced and time and resources spent pursuing false results could be saved. This improvement in data quality is of particular interest in discovery and dosing studies, where being able to detect small differences between groups is crucial.
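The simulation idea described above can be sketched in Python as follows. This is not the authors' method. It is a minimal sketch that assumes user variability acts as multiplicative Gaussian noise with a given coefficient of variation, and that groups of 10 animals are compared with a pooled-variance two-sample t-test at the 5% level. The group volumes, the noise model, and all names are hypothetical.

```python
import random
from statistics import mean, variance

# Two-sided 5% critical value of Student's t with 18 degrees of freedom
# (two groups of 10 animals).
T_CRIT_DF18 = 2.101


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic; assumes equal group sizes."""
    n = len(a)
    pooled = (variance(a) + variance(b)) / 2
    return (mean(a) - mean(b)) / (2 * pooled / n) ** 0.5


def false_negative_rate(control, treated, cv, n_sim=2000, seed=0):
    """Fraction of simulated studies in which a real difference is missed."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_sim):
        # Assumed noise model: each volume is scaled by a N(1, cv) factor.
        noisy_c = [v * rng.gauss(1.0, cv) for v in control]
        noisy_t = [v * rng.gauss(1.0, cv) for v in treated]
        if abs(t_statistic(noisy_c, noisy_t)) < T_CRIT_DF18:
            misses += 1
    return misses / n_sim


# Hypothetical end-of-study volumes (mm^3); the groups differ before noise.
control = [900, 950, 1000, 1050, 1100, 980, 1020, 940, 1060, 1000]
treated = [750, 800, 820, 860, 900, 780, 840, 790, 870, 830]

low_variability_rate = false_negative_rate(control, treated, cv=0.05)
high_variability_rate = false_negative_rate(control, treated, cv=0.25)
```

Under these assumptions, a larger measurement CV gives a higher false negative rate for the same underlying treatment effect, which is the qualitative relationship the abstract reports.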
Tumour volume is typically calculated using only length and width measurements, using width as a proxy for height in a 1:1 ratio. When tracking tumour growth over time, important morphological information and measurement accuracy are lost by ignoring height, which we show is a unique variable. Lengths, widths, and heights of 9522 subcutaneous tumours in mice were measured using 3D and thermal imaging. The average height:width ratio was found to be 1:3, proving that using width as a proxy for height overestimates tumour volume. Comparing volumes calculated with and without tumour height to the true volumes of excised tumours indeed showed that the volume formula including height produced volumes 36 times more accurate (based on percentage difference). Monitoring the height:width relationship (prominence) across tumour growth curves indicated that prominence varied, and that height could change independently of width. Twelve cell lines were investigated individually; the scale of tumour prominence was cell line-dependent, with relatively less prominent tumours (MC38, BL2, LL/2) and more prominent tumours (RENCA, HCT116) detected. Prominence trends across the growth cycle were also dependent on cell line; prominence was correlated with tumour growth in some cell lines (4T1, CT26, LNCaP), but not others (MC38, TC-1, LL/2). When pooled, invasive cell lines produced tumours that were significantly less prominent at volumes >1200 mm³ compared to non-invasive cell lines (P < .001). Modelling was used to show the impact of the increased accuracy gained by including height in volume calculations on several efficacy study outcomes. Variations in measurement accuracy contribute to experimental variation and irreproducibility of data; we therefore strongly advise researchers to measure height to improve accuracy in tumour studies.
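To make the effect of the width-as-height proxy concrete, the sketch below compares two volume calculations in Python. The abstract does not state which formula was used. The ellipsoid form (π/6 × length × width × height) is assumed here as one common convention, and the function names and dimensions are hypothetical.

```python
import math


def volume_width_proxy(length, width):
    """Ellipsoid volume with width standing in for height (1:1 ratio)."""
    return math.pi / 6 * length * width * width


def volume_with_height(length, width, height):
    """Ellipsoid volume using the measured height."""
    return math.pi / 6 * length * width * height


# Hypothetical tumour (mm) with a height:width ratio of 1:3.
length, width, height = 12.0, 9.0, 3.0

proxy_volume = volume_width_proxy(length, width)
measured_volume = volume_with_height(length, width, height)
overestimate_factor = proxy_volume / measured_volume
```

Under this assumed formula, a tumour whose height is one third of its width is overestimated by a factor of three when width is substituted for height, independent of the constant in front of the formula.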