“…Risk probability optimality problems for Markov decision processes are first classified into three groups according to the holding times of the system state: discrete-time Markov decision processes (DTMDPs) [2,26,28,29,30,31], semi-Markov decision processes (SMDPs) [10,11,12,13,25], and continuous-time Markov decision processes (CTMDPs) [14,15,16]. A second classification distinguishes risk probability optimization problems in the reward case from those in the loss case.…” (DOI: 10.14736/kyb-2021-2-0272)