It suffices to prove the upper bound, since the minimax lower bound was already shown by Han et al. (2019) for isotonic regression without unknown permutations. We also focus on the case $d \ge 3$, since the result is already available for $d = 2$ (Shah et al., 2017; Mao et al., 2020). Our proof proceeds in two parts. First, we show that the bounded least squares estimator over isotonic tensors (without unknown permutations) enjoys the claimed risk bound. We then use the proof of this result to establish the upper bound (7). The proof of the first result is also useful in establishing part (b) of Proposition 2.

Bounded LSE over isotonic tensors. For a tensor $A \in \mathbb{R}^{L_{d,n}}$ and $x_1, \ldots, x_{d-2} \in [n_1]$, let $A_{x_1,\ldots,x_{d-2}}$ denote the matrix formed by fixing the first $d-2$ dimensions (variables) of $A$ to $x_1, \ldots, x_{d-2}$; that is, entry $(i,j)$ of this matrix is given by $A(x_1, \ldots, x_{d-2}, i, j)$. Recall that $\mathcal{M}(L_{2,n_1,n_1})$ denotes the set of all bivariate isotonic $n_1 \times n_1$ matrices. For convenience, let $\mathcal{M}(L_{d,n} \mid r)$ and $\mathcal{M}(L_{2,n_1,n_1} \mid r)$ denote the intersections of the respective sets with the $\ell_\infty$ ball of radius $r$. Letting $A - B$ denote the Minkowski difference between the sets $A$ and $B$, define
$$
\mathcal{M}_{\mathrm{full}}(r) := \bigl\{ A \in \mathbb{R}^{L_{d,n}} : A_{x_1,\ldots,x_{d-2}} \in \mathcal{M}(L_{2,n_1,n_1} \mid r) \text{ for all } (x_1,\ldots,x_{d-2}) \in L_{d-2,n_1,\ldots,n_1} \bigr\}.
$$
We observe that there is a bijection between $\mathcal{M}_{\mathrm{full}}(r)$ and the Cartesian product
$$
\prod_{(x_1,\ldots,x_{d-2}) \in L_{d-2,n_1,\ldots,n_1}} \mathcal{M}(L_{2,n_1,n_1} \mid r).
$$
Note that through this notation, we are indexing the components of this Cartesian product by the elements of $L_{d-2,n_1,\ldots,n_1}$, so that a generic element $A$ of this set has $(x_1,\ldots,x_{d-2})$ component given by the matrix $A_{x_1,\ldots,x_{d-2}} \in \mathcal{M}(L_{2,n_1,n_1} \mid r)$. Note that by construction, we have ensured, for each $r \ge 0$, the inclusions
$$
\mathcal{M}(L_{d,n} \mid r) \subseteq \mathcal{M}_{\mathrm{full}}(r) \quad \text{and} \quad \mathcal{M}(L_{d,n} \mid r) - \mathcal{M}(L_{d,n} \mid r) \subseteq \mathcal{M}_{\mathrm{full}}(r) - \mathcal{M}_{\mathrm{full}}(r),
$$
so that for each $t \ge 0$,
$$
\sup_{\substack{\Delta \in \mathcal{M}(L_{d,n} \mid 1) - \mathcal{M}(L_{d,n} \mid 1) \\ \|\Delta\|_2 \le t}} \langle \epsilon, \Delta \rangle \;\le\; \sup_{\substack{\Delta \in \mathcal{M}_{\mathrm{full}}(1) - \mathcal{M}_{\mathrm{full}}(1) \\ \|\Delta\|_2 \le t}} \langle \epsilon, \Delta \rangle,
$$
where $\epsilon$ denotes the noise tensor. For convenience, define for each $t \ge 0$ the random variable
$$
\xi(t) := \sup_{\substack{\Delta \in \mathcal{M}_{\mathrm{full}}(1) - \mathcal{M}_{\mathrm{full}}(1) \\ \|\Delta\|_2 \le t}} \langle \epsilon, \Delta \rangle.
$$
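For intuition, note that the first of these inclusions is strict in general, since membership of $\mathcal{M}_{\mathrm{full}}(r)$ imposes no ordering across the first $d-2$ coordinates. The following is a minimal illustration of our own, taking $d = 3$ and $n_1 = 2$: consider the tensor $A$ with slices
$$
A_{1} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad A_{2} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
$$
Both slices are constant, hence bivariate isotonic, and are bounded by $1$ in $\ell_\infty$ norm, so $A \in \mathcal{M}_{\mathrm{full}}(1)$; yet $A(1,1,1) = 1 > 0 = A(2,1,1)$, so $A$ is not monotone in its first coordinate and therefore $A \notin \mathcal{M}(L_{3,n} \mid 1)$.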