The proportional hazards (PH) model is arguably one of the most popular models used to analyze time-to-event data arising from clinical trials and longitudinal studies, among many others. In many such studies, the event time of interest is not directly observed but is known only relative to periodic examination times; i.e., practitioners observe either current status or interval-censored data. The analysis of data of this structure is often fraught with difficulties. Further exacerbating this issue, in some such studies the observed data also include instantaneous failures; i.e., the event times for several study units coincide exactly with the time at which the study begins. In light of these difficulties, this work focuses on developing a mixture model, under the PH assumptions, which can be used to analyze interval-censored data subject to instantaneous failures. To allow for modeling flexibility, two methods of estimating the unknown cumulative baseline hazard function are proposed: a fully parametric representation and a monotone spline representation. Through a novel data augmentation procedure involving latent Poisson random variables, an expectation-maximization (EM) algorithm is developed to complete model fitting. The resulting EM algorithm is easy to implement and is computationally efficient. Moreover, through extensive simulation studies, the proposed approach is shown to provide both reliable estimation and inference.

Keywords: EM algorithm · instantaneous failure data · interval-censored data · monotone splines · proportional hazards model
Introduction

Interval-censored data commonly arise in many clinical trials and longitudinal studies, and are characterized by the fact that the event time of interest is not directly observable, but rather is known only relative to observation times. As a special case, current status data (or case-1 interval censoring) arise when there exists exactly one observation time per study unit; i.e., at the observation time one discovers whether or not the event of interest has occurred. Data of this structure often occur in resource-limited environments or as a result of destructive testing. Alternatively, general interval-censored data (or case-2 interval censoring) arise when multiple observation times are available for each study unit, and the event time can be ascertained relative to two observation times. It is well known that ignoring the structure of interval-censored data during an analysis can lead to biased estimation and inaccurate inference.
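To make the two censoring schemes concrete, the following minimal sketch shows how a latent event time is reduced to an observed interval. The function name `censor_interval`, the half-open convention (left, right], and the use of 0 and infinity as default endpoints are illustrative assumptions, not notation taken from this work.

```python
import math
from bisect import bisect_left


def censor_interval(event_time, observation_times):
    """Return (left, right) such that left < event_time <= right.

    Hypothetical helper for illustration only: the endpoints are the
    two observation times that bracket the latent event time, with 0
    and infinity used when no bracketing time exists on that side.
    """
    times = sorted(observation_times)
    # Index of the first observation time at or after the event.
    idx = bisect_left(times, event_time)
    left = times[idx - 1] if idx > 0 else 0.0
    right = times[idx] if idx < len(times) else math.inf
    return left, right


# Current status (case-1): a single observation time per unit.
print(censor_interval(2.5, [3.0]))  # event already occurred at the visit
print(censor_interval(2.5, [2.0]))  # event not yet occurred at the visit

# General interval censoring (case-2): several observation times.
print(censor_interval(2.5, [1.0, 2.0, 3.0, 4.0]))
```

With a single observation time, the interval is either left-censored or right-censored, which is exactly the current status setting; with several observation times, the event time is bracketed by the two adjacent visits.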