“…Biological sequence design has been approached with a wide variety of methods: reinforcement learning (Angermueller et al., 2019), Bayesian optimization (Wilson et al., 2017; Belanger et al., 2019; Moss et al., 2020; Pyzer-Knapp, 2018; Terayama et al., 2021), search/sampling using deep generative models (Brookes et al., 2019a; Kumar & Levine, 2020; Das et al., 2021; Hoffman et al., 2021; Melnyk et al., 2021), deep model-based optimization (Trabucco et al., 2021a), adaptive evolutionary methods (Hansen, 2006; Swersky et al., 2020; Sinai et al., 2020), likelihood-free inference (Zhang et al., 2021), and black-box optimization with surrogate models (Dadkhahi et al., 2021). As suggested in Section 3, GFlowNets have the potential to improve over such methods in four ways: by amortizing the cost of search over learning (in contrast to, e.g., MCMC, whose cost is governed by its mixing time); by assigning probability mass to the entire space, which facilitates exploration and diversity (in contrast to, e.g., RL, which tends to be greedier); by enabling the use of imperfect data (in contrast to, e.g., generative models that require strictly positive or negative samples); and by scaling well with data through exploiting structure in function approximation (in contrast to, e.g., Bayesian methods that can cost $O(n^3)$ for $n$ datapoints).…”