The influence of water conditions on the bioavailability and toxicity of metals has long been recognized. In the United States and elsewhere, regulatory criteria to protect aquatic life have likewise acknowledged factors that modify bioavailability, and have incorporated water hardness-toxicity regressions into criteria for several metals. Biotic ligand models (BLMs) became a dominant paradigm for predicting the aquatic toxicity of metals as mechanistic functions of modeled metal speciation and cation competition for binding to ionoregulatory sites on the gill or biotic ligand surface, where metal binding disrupts the internal mineral balance of organisms and leads to stress or death. Following development of software that efficiently executed BLM calculations, in 2007 the U.S. Environmental Protection Agency (USEPA) published national recommended aquatic life criteria for copper (Cu) that were based upon the BLM construct and software. However, states have not widely adopted the USEPA’s recommended BLM-based aquatic life criteria into their water quality standards. Reasons for the languishing of the BLM-based Cu criteria presumably include that they are far more complex than any previous aquatic life criteria and that the BLM requires much more data than water hardness alone: calculating BLM-based criteria requires pH, dissolved organic carbon (DOC), a suite of major ions, alkalinity, and specialized software. The purpose of the present study is to compare the leading available aquatic life criteria models for predicting Cu and zinc (Zn) toxicity (or lack of toxicity) to diverse freshwater species and ecosystems: 1) single linear regression hardness models; 2) multiple linear regression (MLR) models that predict toxicity as a function of DOC, hardness, and pH; and 3) BLM-based criteria.
The scope of the review for Zn was truncated relative to that of Cu because no generally applicable MLR model for Zn was available at the time of writing. The comparisons included evaluating the performance of the models for predicting Cu toxicity and the models’ behavior in natural waters. The evaluations of toxicity predictions were limited to freshwater animals and included testing model predictions against literature reports of observed acute and chronic responses with trout, other fish, mussels and other invertebrates, and olfactory impairment or avoidance responses in salmonids. The presumed “safe” criteria concentrations were also compared against effects attributed to Cu in field studies or ecosystem experiments. The comparisons emphasize natural waters in California; however, conditions in California waters are diverse, and the comparisons are broadly relevant to other freshwaters. The results were consistent with previous work that found that the MLR toxicity predictions were generally more reliable than the BLM for predicting Cu toxicity across the vast majority of natural water types, but that the BLM predictions were more reliable in situations with low or high pH (less than about 5.7 or greater than about 9) or in waters with unusual ionic composition. However, both the MLR and BLM performed well with many datasets, and both criteria calculation procedures produced criteria concentrations that mostly appeared safe in comparisons with field and ecosystem studies and with particularly sensitive species and endpoints such as chemoreception and avoidance behaviors in salmonids. In marked contrast, the hardness-regression model toxicity predictions performed poorly with all datasets tested, with little correspondence between predicted and observed effects.
The comparisons of the hardness-based criteria concentrations with Cu effects thresholds estimated from field and ecosystem studies showed that the hardness-based criteria failed to reliably produce protective concentrations. The hardness-based criteria would be protective in soft waters with less than about 40 mg/L hardness, but not at higher hardness. In natural waters, the hardness-based criteria had little correlation with the MLR criteria, which were the most accurate proxy for truly toxic or nontoxic conditions, and in some streams, the hardness-based criteria were actually negatively correlated with the MLR criteria. The performance and protectiveness of the Cu MLR and BLM models and criteria were largely similar. However, the Cu MLR has advantages over the BLM for adoption into water quality standards because of its reduced data requirements, simpler form, ease of calculation, transparency, and independence from specialized computer software. The Cu MLR could be structured in tiers to accommodate missing data by setting conservative default values for pH or DOC, for example, pH 7 and 1 mg/L DOC. Then, only if environmental Cu concentrations exceeded criteria as calculated with default values would actual DOC and pH data be needed.
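
The tiered structure described above can be sketched in a few lines of Python. This is only an illustration of the screening logic, not the study's implementation: the function `toy_mlr_criterion` and its coefficients are hypothetical placeholders (they are not published MLR coefficients), and the names `tiered_screen` and `ScreenResult` are assumptions introduced here. The only values taken from the text are the example defaults of pH 7 and 1 mg/L DOC.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional

# Example conservative defaults mentioned in the text.
DEFAULT_PH = 7.0
DEFAULT_DOC_MG_L = 1.0


def toy_mlr_criterion(doc_mg_l: float, hardness_mg_l: float, ph: float) -> float:
    """Hypothetical stand-in for an MLR criterion (ug/L Cu).

    The log-linear form follows the idea of predicting toxicity from DOC,
    hardness, and pH, but the coefficients are ILLUSTRATIVE PLACEHOLDERS,
    not values from any published model.
    """
    ln_c = -8.0 + 0.6 * math.log(doc_mg_l) + 0.3 * math.log(hardness_mg_l) + 1.2 * ph
    return math.exp(ln_c)


@dataclass
class ScreenResult:
    tier: str                 # "tier1", "tier2", or "needs_data"
    criterion_ug_l: float     # criterion used for the comparison
    exceeds: Optional[bool]   # None when measured pH/DOC are still required


def tiered_screen(
    cu_ug_l: float,
    hardness_mg_l: float,
    criterion_fn: Callable[[float, float, float], float],
    ph: Optional[float] = None,
    doc_mg_l: Optional[float] = None,
) -> ScreenResult:
    """Screen a Cu concentration, asking for pH and DOC only when needed."""
    # Tier 1: criterion calculated with conservative default pH and DOC.
    tier1 = criterion_fn(DEFAULT_DOC_MG_L, hardness_mg_l, DEFAULT_PH)
    if cu_ug_l <= tier1:
        return ScreenResult("tier1", tier1, False)

    # Default-based criterion exceeded: measured pH and DOC are now needed.
    if ph is None or doc_mg_l is None:
        return ScreenResult("needs_data", tier1, None)

    # Tier 2: criterion calculated with site-specific pH and DOC.
    tier2 = criterion_fn(doc_mg_l, hardness_mg_l, ph)
    return ScreenResult("tier2", tier2, cu_ug_l > tier2)


if __name__ == "__main__":
    print(tiered_screen(2.0, 50.0, toy_mlr_criterion))
    print(tiered_screen(10.0, 50.0, toy_mlr_criterion))
    print(tiered_screen(10.0, 50.0, toy_mlr_criterion, ph=7.5, doc_mg_l=5.0))
```

Passing the criterion function as an argument keeps the screening logic separate from any particular model, so a real MLR equation could be substituted without changing the tiers. The defaults are conservative only if the criterion increases with pH and DOC over the range of interest, which is assumed here.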