“…We obtain insights from the nonparametric setting, where the conditional probability of taking a positive label is assumed to be a smooth function (Tsybakov, 2004; Audibert and Tsybakov, 2007). Previous nonparametric active learning algorithms proceed by partitioning the action space into exponentially many sub-regions (e.g., partitioning the unit cube [0, 1]^d into ε^{-d} sub-cubes, each with volume ε^d), and then performing local estimation of the mean (or some higher-order statistic) within each sub-region (Castro and Nowak, 2008; Minsker, 2012; Locatelli et al., 2017, 2018; Shekhar et al., 2021; Kpotufe et al., 2021). We show that, with an appropriately chosen set of neural networks that globally approximates the smooth regression function, one can in fact recover the minimax label complexity for active learning, up to the disagreement coefficient (Hanneke, 2007, 2014) and other logarithmic factors.…”
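
To make the partition-based approach concrete, below is a minimal sketch of the grid-partition-and-local-mean baseline the paragraph describes: split [0, 1]^d into roughly ε^{-d} sub-cubes of volume ε^d and estimate the conditional probability η(x) = P(Y = 1 | x) by the empirical mean of queried labels in each sub-cube. All names here (`cell_index`, `local_mean_estimates`, `query_label`) and the toy smooth η are hypothetical illustrations; the cited algorithms additionally use adaptive refinement and confidence-based querying, which this sketch omits.

```python
import numpy as np

def cell_index(x, eps):
    """Map a point x in [0, 1]^d to its grid cell: each axis is split
    into ceil(1/eps) bins, giving ~eps^{-d} sub-cubes of volume eps^d."""
    n_bins = int(np.ceil(1.0 / eps))
    # Clip so points exactly at 1.0 fall into the last bin.
    return tuple(np.minimum((x / eps).astype(int), n_bins - 1))

def local_mean_estimates(X, query_label, eps):
    """Estimate eta(x) = P(Y = 1 | x) by the empirical mean of
    actively queried labels within each sub-cube."""
    sums, counts = {}, {}
    for x in X:
        k = cell_index(x, eps)
        y = query_label(x)            # active step: request one label
        sums[k] = sums.get(k, 0) + y
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in counts}

# Toy usage: d = 2, a smooth eta, labels drawn as Bernoulli(eta(x)).
rng = np.random.default_rng(0)
eta = lambda x: 0.5 + 0.4 * np.sin(2 * np.pi * x[0])  # smooth regression function
query = lambda x: rng.binomial(1, eta(x))
X_pool = rng.random((2000, 2))
estimates = local_mean_estimates(X_pool, query, eps=0.25)
```

The sketch also makes the contrast in the excerpt visible: the number of sub-cubes (and hence local estimators) grows as ε^{-d}, whereas the paper's approach replaces the per-cell estimates with a single set of neural networks that approximates the regression function globally.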