We propose a new coefficient, the adjusted Wallace coefficient (AW), and corresponding confidence intervals (CI) as quantitative measures of congruence between typing methods. The performance of the derived CI was evaluated using simulated data. Published microbial typing data were used to demonstrate the advantages of AW over the Wallace coefficient.

Several molecular epidemiology studies of clinically relevant microorganisms provide a characterization of isolates based on different typing methods (3, 5, 7). The informed choice of which typing method is more appropriate in a given clinical or microbiological research setting lies in the ability of the method to identify isolates of interest, the execution time, the cost-effectiveness, and the ease of interpretation of the results (16). Nevertheless, to support the decision, a quantitative comparison of the results of the typing methods should also be performed (3).

Carriço et al. (3) proposed the use of the adjusted Rand coefficient (AR) and the Wallace coefficient (W) as measures to assess the congruence of typing methods. These have been applied in several studies comparing or proposing new typing methods (2, 3, 7, 9, 12, 17). AR provides a measure of the overall agreement between two typing methods and corrects the previously used coefficient of typing concordance (18) for chance agreement, avoiding the overestimation of concordance between typing methods (8). W provides information about the directional agreement between typing methods. W_{A→B} is the probability that, for a given data set, two individuals are classified together using method B if they have been classified together using method A. In spite of its simple interpretation, one can obtain high values of W due to chance alone. For instance, if method A creates a high number of partitions (such as pulsed-field gel electrophoresis [PFGE] subtypes) and method B creates only two (such as the presence or absence of a given gene), W_{A→B} will be high but may not be different from the value expected by chance alone.

The expected Wallace coefficient under independence (W_i) was previously proposed to evaluate whether the results of two typing methods could agree by chance alone (11). To assess whether the estimated W value is significantly different from the W_i value, one can use the proposed Wallace 95% confidence interval (CI) (11, 13). If the value of W_i is within the CI of W, the null hypothesis of independence between classifications cannot be rejected with the respective confidence level (11). One way to directly take into account W_i would be to calculate an adjusted version of W.

Albatineh et al. (1) had previously discussed the correction for chance agreement for several similarity indices, including W. Although this correction was never applied in the context of microbial typing studies, others have previously acknowledged the importance and usefulness of such a correction (15).

Derivation of AW. The adjusted Wallace coefficient (AW) is derived by following an approach similar to that used for AR (8): The ...
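The quantities discussed above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the derivation itself is truncated in this excerpt. The sketch assumes the usual chance-correction form, AW = (W - W_i) / (1 - W_i), and assumes that W_i for the direction A to B equals the fraction of all isolate pairs that method B places in the same type. The function names and the toy data are hypothetical.

```python
from collections import Counter


def _pairs(k):
    """Number of unordered pairs that can be drawn from k items."""
    return k * (k - 1) // 2


def wallace(a, b):
    """W for the direction A to B: among pairs of isolates grouped
    together by method A, the fraction also grouped together by B."""
    if len(a) != len(b):
        raise ValueError("both classifications must cover the same isolates")
    pairs_a = sum(_pairs(c) for c in Counter(a).values())
    pairs_ab = sum(_pairs(c) for c in Counter(zip(a, b)).values())
    if pairs_a == 0:
        raise ValueError("method A groups no pair of isolates together")
    return pairs_ab / pairs_a


def wallace_expected(b):
    """Assumed W_i: probability that two isolates drawn without
    replacement share a type under method B."""
    return sum(_pairs(c) for c in Counter(b).values()) / _pairs(len(b))


def adjusted_wallace(a, b):
    """Assumed chance-corrected form: (W - W_i) / (1 - W_i)."""
    w = wallace(a, b)
    w_i = wallace_expected(b)
    if w_i == 1:
        raise ValueError("method B assigns a single type; AW is undefined")
    return (w - w_i) / (1 - w_i)


# Toy data: method A yields four types, method B only two.
method_a = [0, 0, 1, 1, 2, 2, 3, 3]
method_b = [0, 0, 0, 0, 0, 0, 0, 1]

print(wallace(method_a, method_b))           # 0.75
print(wallace_expected(method_b))            # 0.75
print(adjusted_wallace(method_a, method_b))  # 0.0
```

In this toy data set W is 0.75, yet it equals the value expected under independence, so the adjusted coefficient is 0. This mirrors the situation described in the text, where a method with few partitions produces a high W by chance alone.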
Several research fields frequently deal with the analysis of diverse classification results of the same entities. This requires an objective detection of overlaps and divergences between the formed clusters. The congruence between classifications can be quantified by clustering agreement measures, including pairwise agreement measures. Several measures have been proposed, and the importance of obtaining confidence intervals for the point estimate when comparing these measures has been highlighted. A broad range of methods can be used for the estimation of confidence intervals. However, evidence is lacking on which methods are appropriate for calculating confidence intervals for most clustering agreement measures. Here we evaluate the resampling techniques of bootstrap and jackknife for the calculation of confidence intervals for clustering agreement measures. Contrary to what has been shown for some statistics, simulations showed that the jackknife performs better than the bootstrap at accurately estimating confidence intervals for pairwise agreement measures, especially when the agreement between partitions is low. The coverage of the jackknife confidence interval is robust to changes in cluster number and cluster size distribution.
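As an illustration of the jackknife approach mentioned above, the following Python sketch computes a leave-one-out confidence interval for a pairwise agreement measure. It is a generic textbook jackknife, not necessarily the procedure evaluated in the study. The assumptions are: the Wallace coefficient as the example statistic, pseudo-values for the point estimate, and a normal quantile (z = 1.96) for an approximate 95% interval. All function names and data are hypothetical.

```python
import math
from collections import Counter


def wallace(a, b):
    """Example pairwise agreement statistic: among pairs grouped together
    by partition a, the fraction also grouped together by partition b."""
    pairs = lambda k: k * (k - 1) // 2
    pairs_a = sum(pairs(c) for c in Counter(a).values())
    pairs_ab = sum(pairs(c) for c in Counter(zip(a, b)).values())
    return pairs_ab / pairs_a


def jackknife_ci(a, b, stat, z=1.96):
    """Leave-one-out jackknife interval for stat(a, b).

    Returns (estimate, lower, upper). The estimate is the mean of the
    jackknife pseudo-values; the interval is not clipped to [0, 1].
    Assumes stat stays defined after removing any single entity.
    """
    a, b = list(a), list(b)
    n = len(a)
    theta = stat(a, b)
    # Recompute the statistic with each entity removed in turn.
    leave_one_out = [stat(a[:i] + a[i + 1:], b[:i] + b[i + 1:]) for i in range(n)]
    pseudo = [n * theta - (n - 1) * t for t in leave_one_out]
    mean = sum(pseudo) / n
    # Variance of the pseudo-value mean.
    var = sum((p - mean) ** 2 for p in pseudo) / (n * (n - 1))
    half_width = z * math.sqrt(var)
    return mean, mean - half_width, mean + half_width


part_a = [0, 0, 0, 1, 1, 1, 2, 2]
part_b = [0, 0, 1, 1, 1, 1, 0, 0]
estimate, lower, upper = jackknife_ci(part_a, part_b, wallace)
print(estimate, lower, upper)
```

Passing the statistic as a function keeps the resampling logic separate from the agreement measure, so the same sketch can be reused with other pairwise measures.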