We assessed the performance of ambient ozone (O3) and carbon dioxide (CO2) sensor field calibration techniques when they were generated using data from one location and then applied to data collected at a new location. This was motivated by a previous study (Casey et al., 2018), which highlighted the importance of determining the extent to which field calibration regression models could be aided by relationships among atmospheric trace gases at a given training location, which may not hold if a model is applied to data collected in a new location. We also explored the sensitivity of these methods to the timing of field calibrations relative to deployment periods. Employing data from a number of field deployments in Colorado and New Mexico that spanned several years, we tested and compared the performance of field-calibrated sensors using both linear models (LMs) and artificial neural networks (ANNs) for regression. Sampling sites covered urban and rural-peri-urban areas and environments influenced by oil and gas production. We found that the best-performing model inputs and model type depended on circumstances associated with individual case studies, such as differing characteristics of local dominant emissions sources, relative timing of model training and application, and the extent of extrapolation outside of the parameter space encompassed by model training. In agreement with findings from our previous study, which focused on data from a single location (Casey et al., 2018), ANNs remained more effective than LMs for a number of these case studies, but there were some exceptions. For CO2 models, exceptions included case studies in which training data collection took place more than several months after the test data period. For O3 models, exceptions included case studies in which the characteristics of dominant local emissions sources (oil and gas vs. urban) were significantly different at model training and testing locations.
Among models that were tailored to case studies on an individual basis, O3 ANNs performed better than O3 LMs in six out of seven case studies, while CO2 ANNs performed better than CO2 LMs in three out of five case studies. The performance of O3 models tended to be more sensitive to deployment location than to extrapolation in time, while the performance of CO2 models tended to be more sensitive to extrapolation in time than to deployment location. The performance of O3 ANN models benefited from the inclusion of several secondary metal-oxide-type sensors as inputs in five of seven case studies.

Published by Copernicus Publications on behalf of the European Geosciences Union.