Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control

Wan, Runzhe; Zhang, Xinyu; Song, Rui

doi:10.48550/arxiv.2009.04607

Cited by 3 publications

(3 citation statements)

References 37 publications


“…Song et al [14] studied how to suppress the disease spread by controlling the inter-regional mobility. The algorithm is called the dual-objective reinforcement-learning epidemic control agent (DURLECA), which adopts a GNN to capture the graph feature and uses reinforcement learning to decide on the mobility restriction between regions.…”

Section: Macro-based Control Methods (mentioning)

confidence: 99%

Individual Behavior Modeling and Transmission Control During Disease Spread: A Review

Dong, Zhao

2022

Int. J. Crowd Sci.

In this paper, we provide a detailed review of two categories of the literature: the spontaneous protective behaviors of individuals during disease spread and the mandatory measures to control the disease spread. In the literature, the models of individual protective behaviors can be divided into two parts: the environment-induced protective behaviors and the information-induced protective behaviors. And the mandatory measures of disease control can be divided into two parts: the macro-based control methods and the micro-based control methods. We provide a detailed review to the various categories of research. Then we compare the effects of different control methods through simulation. Among the micro-based control methods, the method based on minimizing the largest eigenvalue has the best effect. This review is of crucial importance to summarize the studies of the spontaneous protective behaviors during disease spread and the mandatory measures to control the disease spread.

“…As another example, the coronavirus disease 2019 (COVID-19) has been one of the worst global pandemics in history affecting millions of people. There is a growing interest in applying RL to develop data-driven intervention policies to contain the spread of the virus (see e.g., Eftekhari et al, 2020; Kompella et al, 2020; Wan et al, 2020). However, the spread of COVID-19 is an extremely complex process and is nonstationary over time.…”

Section: Introduction (mentioning)

confidence: 99%

Reinforcement Learning in Possibly Nonstationary Environments

Li, Shi, Wu et al.

2022

Preprint

We consider reinforcement learning (RL) methods in offline nonstationary environments. Many existing RL algorithms in the literature rely on the stationarity assumption that requires the system transition and the reward function to be constant over time. However, the stationarity assumption is restrictive in practice and is likely to be violated in a number of applications, including traffic signal control, robotics and mobile health. In this paper, we develop a consistent procedure to test the nonstationarity of the optimal policy based on pre-collected historical data, without additional online data collection. Based on the proposed test, we further develop a sequential change point detection method that can be naturally coupled with existing state-of-the-art RL methods for policy optimisation in nonstationary environments. The usefulness of our method is illustrated by theoretical results, simulation studies, and a real data example from the 2018 Intern Health Study. A Python implementation of the proposed procedure is available at https://github.com/limengbinggz/CUSUM-RL.

“…Many real world problems such as radio resource management (Giupponi et al, 2005), infectious disease control (Wan et al, 2020), energy-balancing in sensor networks (Hribar et al, 2022), etc., can be formulated as a multi-objective optimization problem. Whenever an agent is tackling such a problem in a dynamic environment, a single objective Reinforcement Learning (RL) methods such as Q-learning will not result in a behaviour that will be optimal for all objectives.…”

Section: Introduction (mentioning)

confidence: 99%

Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep Reinforcement Learning

Hribar, Luke, Dusparić

2022

Preprint

In this paper, we build on advances introduced by the Deep Q-Networks (DQN) approach to extend the multiobjective tabular Reinforcement Learning (RL) algorithm W-learning to large state spaces. W-learning algorithm can naturally solve the competition between multiple single policies in multi-objective environments. However, the tabular version does not scale well to environments with large state spaces. To address this issue, we replace underlying Q-tables with DQN, and propose an addition of W-Networks, as a replacement for tabular weights (W) representations. We evaluate the resulting Deep W-Networks (DWN) approach in two widely-accepted multi-objective RL benchmarks: deep sea treasure and multi-objective mountain car. We show that DWN solves the competition between multiple policies while outperforming the baseline in the form of a DQN solution. Additionally, we demonstrate that the proposed algorithm can find the Pareto front in both tested environments.
