“…However, they do not integrate their data in a framework that allows scalable and detailed querying (e.g., quickly extracting all water table and temperature data from multiple sites into a single table, for scaling of water table-temperature relationships from individual sites to a broader geographical range). The field of bioinformatics is further along in this regard: for molecular meta-omic data, numerous databases (e.g., MIGS/MIMS, MIMAS, IMG/M, GeneLab) (Hermida et al., 2006; Field et al., 2008; Gattiker et al., 2009; Chen et al., 2019; Ray et al., 2019) and integrative data management platforms (e.g., KBase, MOD-CO, ODG, GeNNet, BioKNO, MGV, OMMS, mixOmics) (Sujansky, 2001; Symons & Nieselt, 2011; Perez-Arriaga et al., 2015; Yoon, Kim & Kim, 2017; Costa et al., 2017; Rohart et al., 2017; Guhlin et al., 2017; Manzoni et al., 2018; Arkin et al., 2018; Brandizi et al., 2018; Rambold et al., 2019) have been developed, and often include standardization of sample metadata to enable efficient data integration. Notable among these are KBase (https://kbase.us/) (Arkin et al., 2018), which provides "apps" through which users can process their data in a framework that tracks processing steps ("provenance") in an accessible format, and MOD-CO (Rambold et al., 2019), a bioinformatics data processing tool that includes a conceptual schema and data model to track metadata and workflows.…”