“…Several years later, the first MILP formulations for the full problem were proposed (Bertsimas and Dunn 2017; Verwer and Zhang 2017). More recent methods improve on these works by using non-crisp decision boundaries (Rhuggenaath et al. 2018), a binary encoding (Verwer and Zhang 2019), new analytical bounds and an improved tree representation translation (Hu, Rudin, and Seltzer 2019), a translation to CP (Verhaeghe et al. 2020), dynamic programming with search (Demirović et al. 2020), caching branch-and-bound (Aglin, Nijssen, and Schaus 2020), and optimized randomization (Blanquero et al. 2021). In this work, we build on these methods to create the first formulation for optimal learning of robust decision trees.…”