“…DP provides optimal policies through the solution of a partial differential equation, whereas the necessary conditions for optimality supplied by the PMP allow one to set up a two-point boundary value problem that, if solved, yields candidate locally optimal solutions. Both methods lead to analytical solutions only in very few cases, and serious hindrances may quickly arise in the numerical context (the stochastic setting is even more troublesome than the deterministic one, the latter being well understood for a fairly wide range of problems; see, e.g., [37,11]). This has fostered the investigation of more tractable approaches to solving NLPs, such as Monte Carlo simulation [34,16], Markov chain discretization [19,20], and deterministic (though non-equivalent) reformulation [2,4], among others.…”