“…Two new manifold-based algorithms that combine episodic exploration and importance sampling were proposed to efficiently learn a manifold in the policy parameter space such that its image in the objective space accurately approximates the Pareto frontier (Parisi, Pirotta, & Peters, 2017). There are also numerous other gradient-based methods to solve multi-objective optimization (Pirotta, Parisi, & Restelli, 2015;Parisi, Pirotta, & Restelli, 2016;Parisi, et al, 2014aParisi, et al, , 2014bPinder, 2016). For example, policy gradient techniques were developed to approximate the Pareto frontier in multi-objective Markov decision processes (Pirotta et al, 2015;Parisi et al, 2016Parisi et al, , 2014aParisi et al, , 2014b.…”