Visual working memory is highly limited, and its capacity is tied to many indices of cognitive function. For this reason, there is much interest in understanding its architecture and the sources of its limited capacity. As part of this effort, researchers often attempt to decompose visual working memory errors into distinct kinds with different origins. One of the most common kinds of memory error is referred to as a “swap,” where people report a value that closely resembles an item that was not probed (i.e., an incorrect, non-target item). Swaps are typically assumed to reflect confusions, such as location binding errors, which result in the wrong item being reported. Capturing swap rates reliably and validly is of great importance because it permits researchers to accurately decompose different sources of memory errors and elucidate the processes that give rise to them. Here, we ask whether different visual working memory models yield robust and consistent estimates of swap rates. This question addresses a major gap in the literature: in both empirical and modeling work, researchers measure swaps without motivating their choice of swap model. We therefore use extensive parameter recovery simulations with three mainstream swap models to demonstrate how the choice of measurement model can result in very large differences in estimated swap rates. We find that these choices can have major implications for how swap rates are estimated to change across conditions. In particular, each of the three models we consider can lead to different quantitative and qualitative interpretations of the data. Our work serves as a cautionary note to researchers as well as a guide for model-based measurement of visual working memory processes.