There is an extensive literature on estimating the optimal individualized treatment regime in a survival context, that is, the regime that assigns treatment so as to maximize the expected survival probability. These methods rest on the key assumption that all confounders are measured, whether in observational studies or in randomized trials with noncompliance. However, this assumption is sometimes too restrictive to hold in practice, and its violation biases the estimate of the optimal regime. In this article, we propose a method to learn the optimal regime when some confounders are unobserved and a valid binary instrumental variable is available. Specifically, we construct an estimator of the potential survival function under any given treatment regime, and we estimate the optimal regime by maximizing the estimated potential survival function over a prespecified class of regimes. We also propose a doubly robust estimator to guard against possible misspecification of a nuisance model. Because the estimators of the potential survival function are jagged, we use kernel smoothing to ease the optimization. Asymptotic properties of the proposed estimators are provided, and simulation results confirm their finite-sample performance when unmeasured confounding exists. Our methods are also illustrated with a real-world example on personalized colorectal cancer screening.
Missing data is a common problem in clinical data collection and complicates the statistical analysis of such data. In this article, we consider this problem in the framework of a semiparametric partially linear model in which the covariates are subject to complex missing patterns. One natural question in the partially linear model is the choice of model structure, that is, how to decide which covariates enter linearly and which enter nonlinearly. If the correct model structure of the partially linear model is available, we propose a new imputation method called Partial Replacement IMputation Estimation (PRIME), which can overcome the problems caused by incomplete data in the partially linear model. In the more challenging setting where the model structure is unknown a priori, we use PRIME in conjunction with model averaging (PRIME-MA) to adapt to the unknown model structure. In simulation studies covering various error distributions, sample sizes, missing data rates, covariate correlations, and noise levels, PRIME outperforms other methods in almost all cases. When the correct model structure is unknown, PRIME-MA achieves satisfactory prediction performance. Moreover, we conduct a study of influential factors in the Chinese Provincial Legal Funding Dataset from the Harvard Dataverse, which shows that our method performs better than the other models.