“…Therefore, sweeping λ over at least tens of different values is the only way to generate models with the desired cost and accuracy characteristics. Recently, UDC [22], LightNAS [20], and our previous work [23] added an explicit cost target T to the regularization term, letting the optimizer reduce the difference between the estimated cost and the target by adding a term proportional to |R(θ) − T|, or an equivalent, to the loss function.…”
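
The target-aware regularizer described in the passage can be illustrated with a minimal sketch. This is not the implementation used in UDC, LightNAS, or [23]. The function names, the `strength` coefficient, and the toy setting in which the cost R(θ) is a single scalar updated directly by subgradient descent are all assumptions made for illustration. The sketch shows only that a penalty proportional to |R(θ) − T| pushes the estimated cost toward T from either side, whereas a plain λR(θ) term always pushes the cost downward and leaves the final cost to be found by sweeping λ.

```python
def target_aware_loss(task_loss, estimated_cost, target, strength):
    """Task loss plus a penalty proportional to |R(theta) - T|.

    `strength` is a hypothetical proportionality constant, not a value
    taken from the cited works.
    """
    return task_loss + strength * abs(estimated_cost - target)


def penalty_subgradient(estimated_cost, target, strength):
    """Subgradient of strength * |R - T| with respect to R."""
    if estimated_cost > target:
        return strength       # cost above target: push it down
    if estimated_cost < target:
        return -strength      # cost below target: push it up
    return 0.0                # at the target: no push from the penalty


def drive_cost_to_target(initial_cost, target, strength=1.0, step=0.5, iters=50):
    """Toy loop: apply only the penalty subgradient to a scalar cost."""
    cost = initial_cost
    for _ in range(iters):
        cost -= step * penalty_subgradient(cost, target, strength)
    return cost


if __name__ == "__main__":
    # Starting above or below the target, the cost settles at T.
    print(drive_cost_to_target(10.0, target=4.0))
    print(drive_cost_to_target(1.0, target=4.0))
    print(target_aware_loss(task_loss=0.3, estimated_cost=6.0, target=4.0, strength=0.1))
```

In a real search the cost estimate R(θ) would be a differentiable function of the architecture parameters and the task loss would pull against the penalty, so the final cost need not land exactly on T. The sketch isolates the penalty term only.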