“…In the second category, a low-complexity policy is instead obtained directly. Here, notable methods include policy distillation [30], VC-dimension constraints [16], concise finite-state machine plans [23], [24], low-memory policies through sparsity constraints [7], and information-theoretic approaches such as KL-regularisation [27], [35], mutual information regularisation with variations [33], [12], [34], and minimal specification complexity [11], [10]. Our work belongs to this second category and most closely resembles [23], [24], [11], [10], but differs in that we consider Kolmogorov complexity.…”