Diagnostic test accuracy, which is based on sensitivity, specificity, and positive/negative predictive values (dichotomous case) and on ROC analysis (continuous case), should be expressed with a single, coherent index. We propose to model the diagnostic test as a flow of information between the disease, which is a hidden state of the patient, and the physician. We assume that: i) sensitivity, specificity, and the false positive/negative rates are the probabilities of a Binary Asymmetric Channel; ii) the information carried by the diagnostic channel is measured by Mutual Information. We introduce two summary measures of accuracy, namely the Information Ratio (IR) for the dichotomous case and the Global Information Ratio (GIR) for the continuous case. We apply our model to a study by Pisano et al. [19], who compared digital and film mammography for diagnosing breast cancer in a screening population of 42,760 women. In film mammography, the maximum IR (0.178) corresponds to the standard cut-off of sensitivity and specificity provided by the ROC analysis (GIR 0.200). The maximum IR and the GIR for digital mammography are higher (0.201 and 0.229, respectively), but the maximum IR corresponds to a cut-off with higher sensitivity and lower specificity, suggesting that the greater information provided by digital mammography carries the risk of more false-positive cases.
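The channel view described above can be illustrated with a minimal sketch. It treats a dichotomous test as a binary asymmetric channel whose transition probabilities are the sensitivity and specificity, and computes the mutual information between disease state and test result. The abstract does not give the formula for IR, so the normalisation by the entropy of the disease state used in `normalised_information` is an assumption made here for illustration only; the function names are likewise hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli variable with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(sensitivity, specificity, prevalence):
    """I(D;T) in bits for a binary asymmetric channel.

    D is the disease state (input), T is the test result (output).
    Uses the identity I(D;T) = H(T) - H(T|D).
    """
    # Probability of a positive test: true positives plus false positives.
    p_positive = prevalence * sensitivity + (1.0 - prevalence) * (1.0 - specificity)
    h_test = binary_entropy(p_positive)
    # Conditional entropy of the test result given the disease state.
    h_test_given_disease = (
        prevalence * binary_entropy(sensitivity)
        + (1.0 - prevalence) * binary_entropy(specificity)
    )
    return h_test - h_test_given_disease


def normalised_information(sensitivity, specificity, prevalence):
    """Mutual information divided by H(D).

    ASSUMPTION: this normalisation is illustrative; the source abstract
    does not state how the Information Ratio is defined.
    """
    h_disease = binary_entropy(prevalence)
    if h_disease == 0.0:
        return 0.0
    return mutual_information(sensitivity, specificity, prevalence) / h_disease
```

Under this sketch, a test with sensitivity equal to one minus specificity carries no information about the disease, while a perfect test recovers all of the uncertainty in the disease state.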
Agreement measures are useful tools both to compare different evaluations of the same diagnostic outcomes and to validate new rating systems or devices. Cohen’s kappa (κ) is certainly the most popular measure of agreement between two raters, and it has proved its effectiveness over the last sixty years. Nevertheless, the method suffers from some alleged issues that have been highlighted since the 1970s; moreover, its value depends strongly on the prevalence of the disease in the sample considered. This work introduces a new agreement index, the informational agreement (IA), which seems to avoid some of the flaws of Cohen’s kappa and separates the contribution of the prevalence from the core of the agreement. These goals are achieved by modelling the agreement, in both the dichotomous and the multivalue ordered-categorical cases, as the information shared between two raters through the virtual diagnostic channel connecting them: the more information exchanged between the raters, the higher their agreement. To assess its behaviour and effectiveness, IA has been tested on some cases known to be problematic for κ, in a machine learning context, and in a clinical scenario comparing ultrasound (US) and automated breast volume scanner (ABVS) in the setting of breast cancer imaging.
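The contrast between κ and an information-based view can be sketched as follows. The code computes Cohen’s kappa from a square agreement table and, alongside it, the mutual information between the two raters' labels. The abstract does not give the formula for IA, so `shared_information` is only the raw mutual information and should not be read as the authors' index; both function names are hypothetical, and the table is assumed to hold counts with rater A on the rows and rater B on the columns.

```python
import math


def _marginals(table):
    """Return (n, row proportions, column proportions) of a square count table."""
    size = len(table)
    n = sum(sum(row) for row in table)
    rows = [sum(table[i]) / n for i in range(size)]
    cols = [sum(table[i][j] for i in range(size)) / n for j in range(size)]
    return n, rows, cols


def cohen_kappa(table):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected).

    Assumes p_expected < 1, i.e. the raters do not both use a single category.
    """
    n, rows, cols = _marginals(table)
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    p_expected = sum(r * c for r, c in zip(rows, cols))
    return (p_observed - p_expected) / (1.0 - p_expected)


def shared_information(table):
    """Mutual information (bits) between the two raters' labels.

    ASSUMPTION: an illustrative quantity, not the IA index of the source.
    """
    n, rows, cols = _marginals(table)
    total = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:
                p = count / n
                total += p * math.log2(p / (rows[i] * cols[j]))
    return total
```

As an example of the prevalence dependence mentioned above, the table `[[90, 5], [5, 0]]` has 90% raw agreement, yet `cohen_kappa` returns a slightly negative value because almost all cases fall in one category.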