“…Metagenomes representing diverse environments such as petroleum reservoirs (Hu et al., 2016; Nie et al., 2016; Liu et al., 2018; Christman et al., 2020), oil spill experimental microcosms (Tan et al., 2015; Dombrowski et al., 2016), marine systems (Orellana et al., 2017; Tully et al., 2018; Dong et al., 2019, 2020), host-associated microbiomes (Feigelman et al., 2017; Herman et al., 2020; Avila-Magaña et al., 2021), and other environments (Yao et al., 2017; Zorz et al., 2019) were downloaded either as unassembled raw data from the NCBI SRA or as predicted gene sequences from the JGI Genome Portal (Supplementary Table 4). Raw reads from unassembled metagenomes were filtered using BBDuk for a minimum quality of 15 and a minimum read length of 150 bp.…”
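
The read-filtering step can be illustrated with a minimal Python sketch. It is not BBDuk and does not reproduce its algorithm. The passage gives only the two thresholds (quality 15, length 150 bp), so several details here are assumptions: that "minimum quality" means the mean Phred score of a read, that qualities use the Phred+33 encoding, and that reads are stored as four-line FASTQ records. All function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

# A read is (header, sequence, quality string).
Read = Tuple[str, str, str]

MIN_QUALITY = 15   # threshold stated in the text
MIN_LENGTH = 150   # threshold stated in the text (bp)
PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield reads from four-line FASTQ records (assumes well-formed input)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def mean_quality(qual: str, offset: int = PHRED_OFFSET) -> float:
    """Mean Phred score of a quality string; 0.0 for an empty string."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def passes(read: Read) -> bool:
    """Keep a read only if it meets both the length and quality thresholds."""
    _, seq, qual = read
    # Assumption: "minimum quality" is interpreted as mean read quality.
    return len(seq) >= MIN_LENGTH and mean_quality(qual) >= MIN_QUALITY


def filter_reads(reads: Iterable[Read]) -> List[Read]:
    """Return the reads that pass both filters, in input order."""
    return [r for r in reads if passes(r)]
```

For example, `filter_reads(parse_fastq(open("reads.fastq")))` would return the retained reads from a hypothetical file `reads.fastq`. If the original workflow trimmed low-quality bases rather than discarding whole reads, the quality test would be applied per base before the length check.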