“…Mean-variance problems in MDPs have been studied extensively; see [4,19,25,28] for the finite-horizon reward variance, [6,10,12,20,28,30] for the infinite-horizon discounted reward variance, [11,24,30] for the first passage variance, and [2,6,8,9,13,14,21,27,29,32] for the limiting average variance. To the best of our knowledge, most of the aforementioned works address mean-variance problems in discrete-time MDPs (DTMDPs) [3,4,6,13,14,21,24,25,28,30,32] or in continuous-time MDPs (CTMDPs) [8,9,10,11,12,20,27]. Only a few works address mean-variance problems in semi-Markov decision processes (SMDPs); see [2,28] for finite SMDPs and [19] for a finite time horizon.…”
DOI: 10.14736/kyb-2017-