“…GOLS-I has also been demonstrated to outperform probabilistic line searches (Mahsereci and Hennig, 2017), provided mini-batch sizes are not too small (i.e., not below 50 for the investigated problems) (Kafka and Wilke, 2019). The gradient-only optimization paradigm has recently also shown promise in the construction of approximation models for conducting line searches (Chae and Wilke, 2019). Some of the most important factors governing the nature of the computed gradients are: 1) the neural network architecture, 2) the activation functions (AFs) used within the architecture, 3) the loss function implemented, and 4) the mini-batch size used to evaluate the loss.…”