“…Apart from building transport maps using TT decompositions, most other methods approximate the transport map $T$ by solving an optimisation problem in which $T$ minimises some statistical divergence between the target $\nu_\pi$ and the pushforward $T_\sharp \mu$. The mapping $T$ often has a triangular structure, which makes evaluating the Jacobian and the inverse of $T$ computationally efficient, and it can be represented using polynomials [4,43,50,51], kernel functions [15,37], invertible neural networks [7,9,10,35,49,52], etc. In this setting, the objective function has to be approximated using a Monte Carlo average and minimised by some (stochastic) gradient-based method.…”
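The optimisation-based construction described in this passage can be illustrated with a minimal sketch. It is not taken from any of the cited works, and every specific in it is an assumption made for illustration: the reference measure is a two-dimensional standard Gaussian, the target is a hypothetical Gaussian known up to a constant, the triangular map is affine with positive diagonal entries, the divergence is the Kullback-Leibler divergence from the pushforward to the target (so the objective is the reference expectation of the negative log target density at the mapped point minus the log-determinant of the Jacobian), and the Monte Carlo average uses one fixed set of reference samples minimised by plain full-batch gradient descent rather than a stochastic variant. The names `neg_log_target`, `loss_and_grad` and `fit` are illustrative only.

```python
import math
import random

# Hypothetical target, known up to a constant (an assumption for this sketch):
#   y1 ~ N(1, 2^2),   y2 | y1 ~ N(0.5 * y1 - 1, 0.5^2)
def neg_log_target(y1, y2):
    e1 = y1 - 1.0
    e2 = y2 - 0.5 * y1 + 1.0
    return e1 * e1 / 8.0 + e2 * e2 / 0.5

def grad_neg_log_target(y1, y2):
    r1 = (y1 - 1.0) / 4.0
    r2 = (y2 - 0.5 * y1 + 1.0) / 0.25
    return r1 - 0.5 * r2, r2

# Lower-triangular affine map with parameters theta = (a1, s1, a2, b, s2):
#   T1(x1)     = a1 + exp(s1) * x1
#   T2(x1, x2) = a2 + b * x1 + exp(s2) * x2
# The Jacobian is triangular, so its log-determinant is simply s1 + s2.
def loss_and_grad(theta, samples):
    a1, s1, a2, b, s2 = theta
    c1, c2 = math.exp(s1), math.exp(s2)
    loss = 0.0
    grad = [0.0] * 5
    for x1, x2 in samples:
        y1 = a1 + c1 * x1
        y2 = a2 + b * x1 + c2 * x2
        g1, g2 = grad_neg_log_target(y1, y2)
        # Per-sample KL integrand (up to a constant): -log pi(T(x)) - log det
        loss += neg_log_target(y1, y2) - s1 - s2
        grad[0] += g1
        grad[1] += g1 * c1 * x1 - 1.0
        grad[2] += g2
        grad[3] += g2 * x1
        grad[4] += g2 * c2 * x2 - 1.0
    n = len(samples)
    return loss / n, [g / n for g in grad]

def fit(n_samples=1000, n_steps=1000, lr=0.05, seed=0):
    rng = random.Random(seed)
    # Fixed reference samples x ~ N(0, I) define the Monte Carlo objective.
    samples = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
               for _ in range(n_samples)]
    theta = [0.0] * 5  # start from the identity map
    history = []
    for _ in range(n_steps):
        loss, grad = loss_and_grad(theta, samples)
        history.append(loss)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta, history
```

For this particular target the exact triangular map is affine, with parameters (1, log 2, -0.5, 1, log 0.5), so calling `fit()` should return values close to these up to Monte Carlo error. A non-Gaussian target would require a richer parametrisation, such as the polynomial, kernel or neural-network maps mentioned in the passage.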