Abstract. Openly accessible global-scale surface water chemistry
datasets are urgently needed to detect widespread trends and problems, to
help identify their possible solutions, and to determine critical spatial data
gaps where more monitoring is required. Existing datasets are limited with respect to
availability, sample size and/or sampling frequency, and geographic scope. These
limitations inhibit researchers from tackling emerging transboundary water chemistry
issues – for example, the detection and understanding of delayed recovery
from freshwater acidification. Here, we begin to address these limitations
by compiling the global Surface Water Chemistry (SWatCh) database, available
on Zenodo (https://doi.org/10.5281/zenodo.6484939; Rotteveel and Heubach, 2021). We collect, clean, standardize, and
aggregate open-access data provided by six national and international
programs and research groups (United Nations Environment Programme; Hartmann
et al., 2019; Environment and Climate Change Canada; the United States of
America National Water Quality Monitoring Council; the European Environment
Agency; and the United States National Science Foundation McMurdo Dry
Valleys Long-Term Ecological Research Network) to compile a database
containing information on sites, methods, and samples, and a geographic
information system (GIS) shapefile of site locations. We remove poor-quality
data (e.g., values flagged
as “suspect” or “rejected”), standardize variable naming conventions and
units, and perform other data cleaning steps required for statistical
analysis. The database contains water chemistry data for streams, rivers,
canals, ponds, lakes, and reservoirs on seven continents, covering 24 variables,
33 722 sites, and over 5 million samples collected between 1960 and 2022.
Consistent with prior research, we identify critical spatial data gaps on the
African and Asian continents, highlighting the need for more data collection
and sharing initiatives in these areas, especially because freshwater
ecosystems in these regions are predicted to be among the most heavily
impacted by climate change. We identify the main challenges associated with
compiling global databases – limited data availability, dissimilar sample
collection and analysis methodology, and reporting ambiguity – and provide
recommended solutions. By addressing these challenges and consolidating data
from various sources into one standardized, openly available, high-quality,
and transboundary database, SWatCh allows users to conduct powerful and
robust statistical analyses of global surface water chemistry.
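
As a minimal illustration of the cleaning steps described above (removing flagged values, standardizing variable names, and converting units), the following Python sketch processes a handful of made-up records. It is not the SWatCh implementation: the record fields, the name map, the unit table, and the function name `clean_records` are all hypothetical and chosen only for this example; only the "suspect" and "rejected" flags come from the text.

```python
# Illustrative sketch only; field names and lookup tables are assumptions,
# not the SWatCh schema or code.

# Quality flags named in the abstract as examples of poor-quality data.
BAD_FLAGS = {"suspect", "rejected"}

# Hypothetical mapping from source-specific names to one standard name.
NAME_MAP = {
    "ca": "calcium",
    "calcium, dissolved": "calcium",
    "so4": "sulfate",
}

# Multiplicative factors to convert mass concentrations to mg/L.
TO_MG_PER_L = {"mg/l": 1.0, "ug/l": 0.001, "g/l": 1000.0}


def clean_records(records):
    """Drop flagged records, then standardize variable names and units."""
    cleaned = []
    for rec in records:
        flag = (rec.get("flag") or "").strip().lower()
        if flag in BAD_FLAGS:
            continue  # remove poor-quality data
        unit = rec["unit"].strip().lower()
        if unit not in TO_MG_PER_L:
            continue  # unknown unit: skip rather than guess a conversion
        name = rec["variable"].strip().lower()
        cleaned.append({
            "variable": NAME_MAP.get(name, name),
            "value": rec["value"] * TO_MG_PER_L[unit],
            "unit": "mg/L",
        })
    return cleaned


# Made-up example records from two imaginary sources.
raw = [
    {"variable": "Ca", "value": 2500.0, "unit": "ug/L", "flag": ""},
    {"variable": "SO4", "value": 3.1, "unit": "mg/L", "flag": "Suspect"},
    {"variable": "Calcium, dissolved", "value": 4.0, "unit": "mg/L", "flag": None},
]

for row in clean_records(raw):
    print(row)
```

In this sketch, records with unrecognized units are dropped rather than converted by guesswork, which reflects the reporting-ambiguity challenge noted above; a real pipeline would more likely log such records for manual review.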