“…By prioritizing these perturbations at each iteration, we efficiently sample the space of possible sequences for informative examples, and thereby train accurate machine learning models with less data [40][41][42]. Active learning has been successfully applied to model metabolic networks [43], optimize cell culture media [44], perform in silico drug screens [45][46][47][48], improve text and image classifiers [42][49], discover energy-efficient materials [50][51], identify TFs that drive cellular differentiation [52], and select optimal training data for nanopore base calling [53]. However, active learning has not yet been applied to train models of cis-regulatory grammars.…”