The chemistry of erupted clinopyroxene crystals (±equilibrium liquids) has been widely used to deduce the pressures and temperatures of magma storage in volcanic arcs. However, the many equations that parametrize the relationship between mineral and melt compositions and intensive variables such as pressure and temperature yield vastly different results, with implications for our interpretation of magma storage conditions. We use a new test dataset of N=505 Cpx-Liq pairs from variably hydrous experiments at crustal conditions (0-13 kbar) to assess the performance of different thermobarometers and to identify the most accurate and precise expressions for application to subduction zone magmas. First, we assess different equilibrium tests, finding that comparisons of measured and predicted EnFs and KD (using Fet in both phases) are the most useful tests in arc magmas, whereas the CaTs, CaTi and Jd tests have limited utility. We then apply further quality filters based on cation sums (3.95-4.05), number of analyses (N>5), and the presence of reported H2O data in the liquid to obtain a filtered dataset (N=194). We use this filtered dataset to compare calculated and experimental pressures and temperatures for different combinations of thermobarometers. Several Cpx-Liq thermometers perform very well when liquid H2O contents are known, although the Cpx composition contributes relatively little to the calculated temperature. Most Cpx-only thermometers perform very poorly, greatly overestimating temperatures for hydrous experiments. Cpx-Liq and Cpx-only barometers perform similarly to one another, all showing low precision and systematic offsets (overestimating pressure for low-P experiments and underestimating pressure for high-P experiments). We also assess the sensitivity of different equations to melt H2O contents, which are poorly constrained in many natural systems.
Overall, this work demonstrates that substantial work is needed to obtain precise and accurate estimates of magma storage depths from Cpx±Liq equilibrium in volcanic arcs. At present, Cpx-based barometry provides only enough resolution to distinguish broad storage regions (e.g., upper, mid, lower crust); it cannot locate magma reservoirs precisely and accurately enough for comparison with geophysical records.
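The quality filters described above (cation sums of 3.95-4.05, N>5 analyses, and reported liquid H2O) can be illustrated with a minimal sketch. The record layout, field names, example values, and the choice of inclusive cation-sum bounds are assumptions made for illustration only; they are not the data structures or code used in this study.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CpxLiqPair:
    """Hypothetical record for one experimental Cpx-Liq pair."""
    cpx_cation_sum: float            # Cpx cation sum (normalization basis assumed)
    n_analyses: int                  # number of analyses averaged for the pair
    liq_h2o_wt: Optional[float]      # liquid H2O (wt%), None if not reported


def passes_quality_filters(pair: CpxLiqPair,
                           cation_min: float = 3.95,
                           cation_max: float = 4.05,
                           min_analyses: int = 5) -> bool:
    """Return True if the pair meets all three filters from the text.

    Bounds on the cation sum are treated as inclusive (an assumption);
    the number of analyses must be strictly greater than min_analyses.
    """
    cations_ok = cation_min <= pair.cpx_cation_sum <= cation_max
    analyses_ok = pair.n_analyses > min_analyses
    h2o_reported = pair.liq_h2o_wt is not None
    return cations_ok and analyses_ok and h2o_reported


def filter_pairs(pairs: List[CpxLiqPair]) -> List[CpxLiqPair]:
    """Keep only the pairs that pass every quality filter."""
    return [p for p in pairs if passes_quality_filters(p)]


# Made-up example values, not data from the test dataset.
pairs = [
    CpxLiqPair(4.01, 8, 2.5),    # passes all filters
    CpxLiqPair(4.10, 8, 2.5),    # cation sum too high
    CpxLiqPair(4.00, 3, 2.5),    # too few analyses
    CpxLiqPair(3.98, 10, None),  # liquid H2O not reported
    CpxLiqPair(3.95, 6, 0.0),    # passes: H2O reported as 0 wt%
]
filtered = filter_pairs(pairs)
```

In this sketch an anhydrous experiment with a reported H2O value of 0 wt% is retained, because the filter tests whether H2O was reported rather than whether it is non-zero.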