With the outbreak of COVID-19, the shortage of medical resources has become increasingly acute, and efficient strategies for medical resource allocation are urgently needed. However, conventional rule-based methods devised by public health experts have limited capability in dealing with the complex and dynamic situation of a spreading pandemic. Model-based optimization methods such as dynamic programming (DP) also fail to work, since a precise model of the real-world situation is usually unavailable. Model-free reinforcement learning (RL), on the other hand, is powerful for decision making, but three key challenges arise in applying it to this problem: (1) the real-world situation is complex and offers countless choices for decision making; (2) only imperfect information is available due to the latency of pandemic spreading; (3) experiments in the real world are limited, since we cannot trigger pandemic outbreaks arbitrarily. In this paper, we propose a hierarchical reinforcement learning framework with several specially designed components. We design a decomposed action space with a corresponding training algorithm to deal with the countless choices and to ensure efficient, real-time strategies. We design a recurrent neural network based framework to utilize the imperfect information obtained from the environment. We also design a multi-agent voting method, which modifies the decision-making process to account for the randomness in model training and thus improves performance. We build a pandemic spreading simulator based on real-world data, which serves as the experimental platform. We conduct extensive experiments, and the results show that our method outperforms all the baselines, reducing infections and deaths by 14.25% on average without the multi-agent voting method and by up to 15.44% with it.
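
The multi-agent voting idea mentioned above can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so the code below assumes the simplest variant: several independently trained agents each propose a discrete action for the same observation, and the most frequent proposal is chosen. The function name `majority_vote`, the representation of agents as callables, and the tie-breaking rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# Hypothetical type: an agent maps an observation to a discrete action.
Agent = Callable[[object], Hashable]


def majority_vote(agents: Sequence[Agent], observation: object) -> Hashable:
    """Return the action proposed by the most agents (assumed voting rule).

    Ties are broken in favor of the tied action that was proposed first,
    which is an arbitrary choice made for this sketch.
    """
    if not agents:
        raise ValueError("at least one agent is required")
    votes = [agent(observation) for agent in agents]
    counts = Counter(votes)
    top = max(counts.values())
    for action in votes:
        if counts[action] == top:
            return action


# Usage: three stand-in agents trained with different random seeds
# might disagree; the vote picks the action most of them propose.
agents = [lambda obs: 1, lambda obs: 2, lambda obs: 2]
chosen = majority_vote(agents, observation=None)
```

Aggregating over several trained agents is one common way to reduce the variance that comes from random initialization and stochastic training; whether the paper uses plain majority voting or a weighted scheme is not stated in the abstract.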