A number of emerging applications call for studying data streams: potentially infinite flows of information updated in real time. When multiple co-evolving data streams are observed, an important task is to determine how these streams depend on each other, accounting for dynamic dependence patterns without imposing a restrictive probabilistic law on this dependence. In this paper we argue that flexible least squares (FLS), a penalized version of ordinary least squares that accommodates time-varying regression coefficients, can be deployed successfully in this context. Our motivating application is statistical arbitrage, an investment strategy that exploits patterns detected in financial data streams. We demonstrate that FLS is algebraically equivalent to the well-known Kalman filter equations, and we exploit this equivalence to gain a better understanding of FLS and to suggest a more efficient algorithm. Promising experimental results from an FLS-based algorithmic trading system for the S&P 500 Futures Index are reported.
Keywords: Mean reversion, Statistical arbitrage, Pairs trading, State space model, Time-varying autoregressive processes, Dynamic regression, Bayesian forecasting. MSC: 91B84, 91B28, 62M10.
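The FLS–Kalman equivalence claimed in the abstract can be sketched numerically. FLS chooses coefficient paths minimizing the sum of squared residuals plus a penalty μ on squared coefficient changes, Σ_t (y_t − x_t′β_t)² + μ Σ_t ‖β_{t+1} − β_t‖²; this matches a Kalman filter for a random-walk state with noise covariance I/μ and unit observation variance. The following is a minimal illustrative sketch under those assumptions — the function name `fls_kalman` and the diffuse-prior initialization are choices made here for illustration, not the authors' algorithm:

```python
import numpy as np

def fls_kalman(y, X, mu):
    """Filtered time-varying regression coefficients via a Kalman recursion.

    Sketch of the FLS objective
        sum_t (y_t - x_t' b_t)^2 + mu * sum_t ||b_{t+1} - b_t||^2
    as a state-space model: random-walk state with covariance I/mu,
    unit observation noise (assumed correspondence, up to scaling).
    """
    T, k = X.shape
    beta = np.zeros(k)               # state estimate b_{t|t}
    P = np.eye(k) * 1e6              # diffuse prior covariance (a choice)
    Q = np.eye(k) / mu               # penalty mu <-> state noise I/mu
    betas = np.empty((T, k))
    for t in range(T):
        x = X[t]
        P_pred = P + Q               # predict: random-walk state
        S = x @ P_pred @ x + 1.0     # innovation variance (R = 1)
        K = P_pred @ x / S           # Kalman gain
        beta = beta + K * (y[t] - x @ beta)     # measurement update
        P = P_pred - np.outer(K, x) @ P_pred
        betas[t] = beta
    return betas
```

For noiseless data with constant coefficients, the filtered estimates lock onto the true coefficients after the initial diffuse transient; a larger μ yields smoother, more slowly adapting coefficient paths, which is the trade-off FLS exposes.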