2019
DOI: 10.1186/s12911-019-0755-6

Incorporating causal factors into reinforcement learning for dynamic treatment regimes in HIV

Abstract: Background: Reinforcement learning (RL) provides a promising technique to solve complex sequential decision-making problems in health care domains. However, existing studies simply apply naive RL algorithms in discovering optimal treatment strategies for a targeted problem. This kind of direct application ignores the abundant causal relationships between treatment options and the associated outcomes that are inherent in medical domains. Methods: This paper investigates h…

Cited by 18 publications (7 citation statements)
References: 11 publications

“…The majority of research using RL in healthcare is in dynamic treatment regimes, where the goal is to develop effective treatment regimes that can dynamically adapt to the varying clinical states and improve the long-term outcomes for patients (Yu et al, 2019b). This includes DTR for diseases such as cancer (Zhao, Kosorok, & Zeng, 2009; Liu, Logan, Liu, Xu, Tang, & Wang, 2017), diabetes (Daskalaki, Scarnato, Diem, & Mougiakakou, 2010; Bothe, Dickens, Reichel, Tellmann, Ellger, Westphal, & Faisal, 2013; Daskalaki, Diem, & Mougiakakou, 2013), anemia (Malof & Gaweda, 2011; Escandell-Montero, Chermisi, Martinez-Martinez, Gomez-Sanchis, Barbieri, Soria-Olivas, Mari, Vila-Francés, Stopper, Gatti, et al, 2014), HIV (Parbhoo, 2014; Parbhoo, Bogojeska, Zazzi, Roth, & Doshi-Velez, 2017; Yu, Dong, Liu, & Ren, 2019a), mental illnesses (Paredes, Gilad-Bachrach, Czerwinski, Roseway, Rowan, & Hernandez, 2014; Pineau, Guez, Vincent, Panuccio, & Avoli, 2009), and DTR in critical care (Weng, Gao, He, Yan, & Szolovits, 2017; Petersen, Yang, Grathwohl, Cockrell, Santiago, An, & Faissol, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…Other approaches applied various kinds of TRL techniques so as to take advantage of the prior information from previously learned transition models [149], [150] or learned policy [151]. More recently, Yu et al [152] proposed a causal policy gradient algorithm and evaluated it in the treatment of HIV in order to facilitate the final learning performance and increase explanations of learned strategies.…”
Section: A Chronic Diseases (mentioning)
confidence: 99%
“…We use a previously validated Progression and Transmission of HIV (PATH 2.0) model [26], a dynamic stochastic agent-based model, to simulate the epidemic and evaluate the decisions. Previous RL models in HIV have focused on patient-level clinical decisions such as optimal treatment protocols [27, 28]. Recent literature has seen an emergence in the use of RL for public health decision-making related to the COVID-19 pandemic, but they predominantly use deterministic equation-based model environments [29–34].…”
Section: Introduction (mentioning)
confidence: 99%