Background
Preparing for the critical gap in a future pandemic between non-pharmacological measures and the deployment of new drugs/vaccines requires addressing two factors: 1) finding virus/pathogen-agnostic pathophysiological targets to mitigate disease severity and 2) finding a more rational approach to repurposing existing drugs. It is increasingly recognized that acute viral disease severity is heavily driven by the immune response to the infection (“cytokine storm” or “cytokine release syndrome”). Numerous clinically available biologics suppress various pro-inflammatory cytokines/mediators, but it is extremely difficult to identify clinically effective treatment regimens with these agents. We propose that this is a complex control problem that resists standard methods of developing treatment regimens, and that solving it requires simulation-based, model-free deep reinforcement learning (DRL), applied in a fashion akin to training successful game-playing artificial intelligences (AIs). This proof-of-concept study tests whether simulated sepsis (e.g. infection-driven cytokine storm) can be controlled in the absence of effective antimicrobial agents by targeting cytokines for which FDA-approved biologics currently exist.

Methods
We use a previously validated agent-based model, the Innate Immune Response Agent-based Model (IIRABM), for control discovery using DRL. DRL training used a Deep Deterministic Policy Gradient (DDPG) approach with a clinically plausible control interval of 6 hours, manipulating six cytokines for which drugs already exist: Tumor Necrosis Factor (TNF), Interleukin-1 (IL-1), Interleukin-4 (IL-4), Interleukin-8 (IL-8), Interleukin-12 (IL-12) and Interferon-γ (IFN-γ).

Results
DRL trained an AI policy that improved outcomes from a baseline Recovered Rate of 61% to a Recovered Rate of 90% over ~21 days of simulated time.
This DRL policy was then tested on four different parameterizations not seen in training, representing a range of host and microbe characteristics, and yielded improvements in Recovered Rate ranging from +33% to +56%.

Discussion
The current proof-of-concept study demonstrates that significant mitigation of disease severity can potentially be accomplished with existing anti-mediator drugs, but only through a multi-modal, adaptive treatment policy that requires an AI to implement. While actual clinical implementation of this approach is a projection for the future, the current goal of this work is to inspire the development of a research ecosystem that couples the work needed to improve the simulation models with the development of the sensing/assay technologies needed to collect the data that iteratively refine those models.
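
As a rough illustration of the control problem described in Methods, the sketch below lays out the decision schedule and action space in Python. It is not the IIRABM or the DDPG implementation used in the study: `ToySepsisEnv`, its damage dynamics, the reward, and the action range of [-1, 1] per cytokine are all hypothetical stand-ins, and the horizon assumes exactly 21 days. Only the six cytokine targets and the 6-hour control interval come from the text above.

```python
import random

# From the abstract: six controllable cytokines, one decision every 6 hours.
CYTOKINES = ("TNF", "IL-1", "IL-4", "IL-8", "IL-12", "IFN-g")
CONTROL_INTERVAL_H = 6
HORIZON_DAYS = 21  # assumption: "~21 days" taken as exactly 21


def n_decisions(days=HORIZON_DAYS, interval_h=CONTROL_INTERVAL_H):
    """Number of control decisions in one simulated episode."""
    return days * 24 // interval_h


class ToySepsisEnv:
    """Hypothetical stand-in for the simulation; NOT the IIRABM."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.t = 0
        self.damage = 0.5

    def reset(self):
        self.t = 0
        self.damage = 0.5
        return (self.damage,)

    def step(self, action):
        # One action value per cytokine; range [-1, 1] is an assumption
        # (negative = inhibit, positive = augment).
        assert len(action) == len(CYTOKINES)
        assert all(-1.0 <= a <= 1.0 for a in action)
        drift = self.rng.uniform(-0.02, 0.03)        # made-up dynamics
        effect = 0.01 * sum(action) / len(action)    # made-up drug effect
        self.damage = min(1.0, max(0.0, self.damage + drift - effect))
        self.t += 1
        done = self.t >= n_decisions() or self.damage in (0.0, 1.0)
        reward = -self.damage                        # made-up reward
        return (self.damage,), reward, done


def rollout(env, policy):
    """Run one episode; the policy maps an observation to six action values."""
    obs = env.reset()
    total, steps, done = 0.0, 0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
        steps += 1
    return total, steps


def no_treatment(obs):
    """Baseline policy that leaves every cytokine untouched."""
    return [0.0] * len(CYTOKINES)


if __name__ == "__main__":
    print("decisions per episode:", n_decisions())
    print("rollout (return, steps):", rollout(ToySepsisEnv(seed=1), no_treatment))
```

Under these assumptions an episode contains 84 decisions, each choosing six continuous values. A continuous, multi-dimensional action space of this kind is the setting DDPG is designed for; a trained actor network would take the place of `no_treatment` as the `policy` argument.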