Background
In large multiregional cohort studies, survival data are often collected at fine geographic levels (such as counties) and aggregated at coarser levels, which produces correlated patterns associated with location. Traditional analyses treat such data either globally or locally by region and often neglect the spatial information they contain, which can bias effect estimates and reduce statistical power.
Method
We propose a Geographically Weighted Accelerated Failure Time Model for spatial survival data to investigate spatial heterogeneity. We establish a weighting scheme and select the bandwidth using quasi-likelihood information criteria. We also examine the theoretical properties of the proposed estimators. To assess the model across scenarios, we conduct a simulation study that varies the sample size and whether the proportional hazards assumption holds. Additionally, we apply the proposed method to ovarian cancer survival data from the Surveillance, Epidemiology, and End Results cancer registry for the state of New Jersey.
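As a rough illustration of what a geographic weighting scheme can look like, the sketch below computes distance-based weights for one target location. The Gaussian kernel, the function name `gaussian_kernel_weights`, and the centroid coordinates are assumptions chosen for illustration only; this abstract does not specify the kernel or bandwidth used in the proposed model.

```python
import math

def gaussian_kernel_weights(target, locations, bandwidth):
    """Return one weight per location, decaying with distance to `target`.

    Assumption: a Gaussian kernel on planar Euclidean distance. The
    weighting scheme actually used by the proposed model may differ.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = []
    for x, y in locations:
        distance = math.hypot(x - target[0], y - target[1])
        weights.append(math.exp(-0.5 * (distance / bandwidth) ** 2))
    return weights

# Hypothetical county centroids in arbitrary planar units.
centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
weights = gaussian_kernel_weights(centroids[0], centroids, bandwidth=2.0)
```

In a geographically weighted fit, weights of this kind would multiply each observation's contribution to the local estimating equation, so nearby counties influence the local estimate more than distant ones. A larger bandwidth moves the fit toward a global model, and a smaller one toward a purely local model.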
Results
Our simulation results indicate that, when the proportional hazards assumption is violated, the proposed model outperforms existing methods, including the geographically weighted Cox model, on four performance measures. Furthermore, when the sample size per location is 20 to 25, the local model cannot be fit to the simulated data, whereas the proposed model still performs satisfactorily. In the empirical study, we identify clear spatial variation in the effects of all three covariates.
Conclusion
The proposed model offers a new way to explore spatial heterogeneity in survival data that differs from both global and local models, and it provides an alternative to geographically weighted Cox regression when the proportional hazards assumption is not met. It also addresses the problem that a local model cannot be fit to some counties' survival data because of limited samples, which is particularly relevant for rare diseases.