Reinforcement learning (RL) can obtain supervisory controllers for discrete-event systems modeled by finite automata and temporal logic. The published methods often have two limitations. First, a large amount of training data is required to learn the RL controller. Second, the RL algorithms do not consider uncontrollable events, which are essential in supervisory control theory (SCT). To address these limitations, we first apply SCT to synthesize supervisors for the specifications modeled by automata. These supervisors remove illegal training data that violate the specifications and hence reduce the exploration space of the RL algorithm. For the remaining specifications, modeled by temporal logic, the RL algorithm searches for the optimal control decision within the confined exploration space. Uncontrollable events are treated by the RL algorithm as uncertainties in the plant model. The proposed method obtains a nonblocking supervisor for all specifications with less learning time than the published methods.
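To illustrate the core idea of confining RL exploration with an SCT supervisor, the following is a minimal sketch in Python, not the paper's implementation. It assumes a hypothetical three-state plant (idle, busy, down), a precomputed supervisor table mapping each plant state to its enabled controllable events, and an uncontrollable "fault" event that the learner experiences only as stochastic plant uncertainty. The event sets, reward values, and fault probability are illustrative assumptions.

```python
import random
from collections import defaultdict

# Hypothetical toy plant (assumed for illustration, not from the paper).
CONTROLLABLE = {"start", "stop"}
UNCONTROLLABLE = {"fault"}

# Plant transitions: (state, event) -> next state.
PLANT = {
    ("idle", "start"): "busy",
    ("busy", "stop"): "idle",
    ("busy", "fault"): "down",
    ("down", "stop"): "idle",
}

# Supervisor assumed to be precomputed by SCT for the automaton-based
# specifications: maps each plant state to its enabled controllable events.
# It never disables uncontrollable events.
SUPERVISOR = {
    "idle": {"start"},
    "busy": {"stop"},
    "down": {"stop"},
}

# Illustrative per-state rewards standing in for the temporal-logic objective.
REWARD = {"idle": 0.0, "busy": 1.0, "down": -5.0}

def step(state, event, p_fault=0.2):
    """Apply a controllable event; an uncontrollable fault may then occur.

    The RL agent never chooses 'fault'; it only observes its effect,
    i.e. the uncontrollable event appears as plant uncertainty.
    """
    nxt = PLANT[(state, event)]
    if (nxt, "fault") in PLANT and random.random() < p_fault:
        nxt = PLANT[(nxt, "fault")]
    return nxt, REWARD[nxt]

def q_learning(episodes=2000, horizon=20, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning whose action set is masked by the supervisor."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = "idle"
        for _ in range(horizon):
            # Supervisor confines exploration to legal controllable events.
            legal = sorted(SUPERVISOR[s] & CONTROLLABLE)
            a = (random.choice(legal) if random.random() < eps
                 else max(legal, key=lambda e: Q[(s, e)]))
            s2, r = step(s, a)
            legal2 = SUPERVISOR[s2] & CONTROLLABLE
            best = max(Q[(s2, e)] for e in legal2)
            Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
            s = s2
    return Q

if __name__ == "__main__":
    Q = q_learning()
    for (s, a), v in sorted(Q.items()):
        print(f"Q({s}, {a}) = {v:.2f}")
```

Because the supervisor prunes illegal controllable events before action selection, the Q-table only grows over state-action pairs that satisfy the automaton-based specifications, which is one concrete way the exploration space, and hence the required training data, can shrink.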