Super-Learning of an Optimal Dynamic Treatment Rule

Luedtke, Alexander R.; Laan, Mark J. van der

doi:10.1515/ijb-2015-0052

Cited by 102 publications

(81 citation statements)

References 30 publications


“…Before estimating the optimal value, one typically estimates the optimal rule. Recently, researchers have suggested applying machine learning algorithms to estimate the optimal rules from large classes which cannot be described by a finite dimensional parameter [see, e.g., Zhang et al (2012b), Zhao et al (2012), Luedtke and van der Laan (2014)].…”

Section: Introduction (mentioning)

confidence: 99%

Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy

Luedtke, Laan

2016

Ann. Statist.

Self Cite


We consider challenges that arise in the estimation of the mean outcome under an optimal individualized treatment strategy defined as the treatment rule that maximizes the population mean outcome, where the candidate treatment rules are restricted to depend on baseline covariates. We prove a necessary and sufficient condition for the pathwise differentiability of the optimal value, a key condition needed to develop a regular and asymptotically linear (RAL) estimator of the optimal value. The stated condition is slightly more general than the previous condition implied in the literature. We then describe an approach to obtain root-n rate confidence intervals for the optimal value even when the parameter is not pathwise differentiable. We provide conditions under which our estimator is RAL and asymptotically efficient when the mean outcome is pathwise differentiable. We also outline an extension of our approach to a multiple time point problem. All of our results are supported by simulations.
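The two quantities in this abstract, the optimal rule and the mean outcome under it, can be illustrated with a minimal simulation sketch. Everything below is an assumption made for illustration: the data-generating process, the linear outcome regressions, the known randomization probability of 0.5, and the function names `simulate` and `estimate_rule_and_value`. It uses a simple plug-in rule and an inverse-probability-weighted value estimate evaluated in-sample; it is not the super-learner or the inference procedure of the papers listed here.

```python
import numpy as np


def simulate(n, rng):
    """Hypothetical randomized trial: the treatment effect given W equals W."""
    W = rng.uniform(-1.0, 1.0, size=n)       # baseline covariate
    A = rng.integers(0, 2, size=n)           # randomized treatment, P(A=1) = 0.5
    Y = 0.5 * W + A * W + rng.normal(0.0, 0.5, size=n)
    return W, A, Y


def fit_arm(W, Y):
    """Least-squares fit of Y on (1, W) within one treatment arm."""
    X = np.column_stack([np.ones_like(W), W])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta


def estimate_rule_and_value(W, A, Y, propensity=0.5):
    """Plug-in rule from arm-specific regressions, plus an IPW value estimate."""
    beta1 = fit_arm(W[A == 1], Y[A == 1])
    beta0 = fit_arm(W[A == 0], Y[A == 0])
    # Estimated conditional treatment effect ("blip") at each observed W.
    blip = (beta1[0] - beta0[0]) + (beta1[1] - beta0[1]) * W
    rule = (blip > 0).astype(int)            # treat when the estimated effect is positive
    # Probability of the treatment actually received (known by design here).
    p_obs = np.where(A == 1, propensity, 1.0 - propensity)
    # IPW estimate of the mean outcome if everyone followed `rule`.
    value = np.mean((A == rule) * Y / p_obs)
    return rule, value


rng = np.random.default_rng(0)
W, A, Y = simulate(20000, rng)
rule, value = estimate_rule_and_value(W, A, Y)
print("agreement with the true rule 1{W > 0}:", np.mean(rule == (W > 0)))
print("estimated optimal value:", value)
```

In this toy setup the true optimal rule treats exactly when W > 0, and the true optimal value is E[max(W, 0)] = 0.25 for W uniform on (-1, 1). Because the conditional treatment effect is zero only at the single point W = 0, this law is not of the non-unique kind that the abstract is concerned with.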


“…Another reason is that the mean reward under the optimal TR seen as a functional, Ψ, is pathwise differentiable at $Q_0$ if and only if, $Q_0$-almost surely, either $|q_{Y,0}(W)| > 0$ or the conditional distributions of $Y$ given $(A = 1, W)$ and $(A = 0, W)$ under $Q_0$ are degenerated [19, Theorem 1]. This explains why it is also assumed that the true law is not exceptional in [34, 18, 20]. Other approaches have been considered to circumvent the need to make this assumption: relying on $m$-out-of-$n$ bootstrap [4] (at the cost of a $\sqrt{m} = o(\sqrt{n})$-rate of convergence and need to fine-tune $m$), or changing the parameter of interest by focusing on the mean reward under the optimal TR conditional on patients for whom the best treatment has a clinically meaningful effect (truncation) [12, 16, 17].…”

Section: Asymptotia (mentioning)

confidence: 99%
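For readers without the citing paper at hand, the objects in the statement above can be written out as follows. This is a sketch under assumed notation: the excerpt does not define $q_{Y,0}$, $d_0$, or Ψ, so the definitions below follow the usual convention for a binary treatment $A$ and baseline covariates $W$ and may differ in detail from the source.

\[
q_{Y,0}(W) = E_{Q_0}[Y \mid A = 1, W] - E_{Q_0}[Y \mid A = 0, W], \qquad
d_0(W) = \mathbb{1}\{q_{Y,0}(W) > 0\},
\]
\[
\Psi(Q_0) = E_{Q_0}\big[\, E_{Q_0}[Y \mid A = d_0(W), W] \,\big].
\]

Under this reading, a law is "exceptional" when $q_{Y,0}(W) = 0$ with positive probability, that is, when some stratum of $W$ has neither a beneficial nor a harmful treatment effect, so that the optimal rule is not unique on that stratum.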

“…The estimation of the optimal TR from i.i.d. observations has been studied extensively, with a recent interest in the use of machine learning algorithms to reach this goal [24, 36, 37, 34, 35, 28, 20]. In contrast, we estimate the optimal TR (and its mean reward) based on sequentially sampled dependent observations by empirical risk minimization over sample-size-dependent classes of candidate estimates with a complexity controlled in terms of uniform entropy integral.…”

Section: Introduction (mentioning)

confidence: 99%

Targeted sequential design for targeted learning inference of the optimal treatment rule and its mean reward

Chambaz, Zheng, Laan

2017

Ann. Statist.

Self Cite


This article studies the targeted sequential inference of an optimal treatment rule (TR) and its mean reward in the non-exceptional case, i.e., assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption. Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal TR. This data-adaptive statistical parameter is worthy of interest on its own. Our main result is a central limit theorem which enables the construction of confidence intervals on both mean rewards under the current estimate of the optimal TR and under the optimal TR itself. The asymptotic variance of the estimator takes the form of the variance of an efficient influence curve at a limiting distribution, which allows us to discuss the efficiency of inference. As a by-product, we also derive confidence intervals on two cumulated pseudo-regrets, a key notion in the study of bandit problems. A simulation study illustrates the procedure. One of the cornerstones of the theoretical study is a new maximal inequality for martingales with respect to the uniform entropy integral.


“…In van der Laan and Luedtke (2014b), our estimator $d_n$ is based on a highly data adaptive super-learner of $d_0$ developed in Luedtke and van der Laan (2014), so that one might be concerned that the Donsker class condition on $d_n$ might be violated theoretically or negatively affect the finite sample coverage of the confidence interval for $E_0 Y_{d_n}$. To deal with this challenge, in van der Laan et al (2013) and van der Laan (2013), van der Laan and Luedtke (2014b) we started a general theory for estimation and inference for data adaptive parameters, such as theorem 2 in van der Laan et al (2013) that avoids any conditions on the estimator $\hat{g}^{*}$, beyond convergence to some fixed $g^{*}$.…”

Section: Statistical Inference For Data Adaptive Target Parameters (mentioning)

confidence: 99%

Discussion of Identification, Estimation and Approximation of Risk under Interventions that Depend on the Natural Value of Treatment Using Observational Data, by Jessica Young, Miguel Hernán, and James Robins

Laan

Luedtke

Díaz

2014

Epidemiologic Methods

Self Cite


Young, Hernán, and Robins consider the mean outcome under a dynamic intervention that may rely on the natural value of treatment. They first identify this value with a statistical target parameter, and then show that this statistical target parameter can also be identified with a causal parameter which gives the mean outcome under a stochastic intervention. The authors then describe estimation strategies for these quantities. Here we augment the authors’ insightful discussion by sharing our experiences in situations where two causal questions lead to the same statistical estimand, or the newer problem that arises in the study of data adaptive parameters, where two statistical estimands can lead to the same estimation problem. Given a statistical estimation problem, we encourage others to always use a robust estimation framework where the data generating distribution truly belongs to the statistical model. We close with a discussion of a framework which has these properties.

