“…Which algorithm performs best in practice does not seem to have a simple answer, and there are instances where one class of algorithms outperforms the other and vice versa [26]. Most of the theoretical literature on momentum-based methods concerns convex problems [18,23,24,30,49] and, although these methods have been successfully applied to a variety of problems, high-dimensional non-convex settings have been considered only recently [22,51,52]. Furthermore, with few exceptions [45], the majority of these studies focus on worst-case analysis, whereas empirically one may also be interested in the behaviour of such algorithms on typical instances of the optimization problem, when the instance is drawn from a probability distribution.…”