High-dimensional time series datasets are becoming increasingly common in fields such as economics, finance, meteorology, and neuroscience. Given this ubiquity of time series data, it is surprising that very few works on variable screening discuss the time series setting, and even fewer have developed methods that exploit the unique features of time series data. This paper introduces several model-free screening methods based on the partial distance correlation and developed specifically to deal with time-dependent data. Methods are developed both for univariate models, such as nonlinear autoregressive models with exogenous predictors (NARX), and for multivariate models, such as linear or nonlinear VAR models. Sure screening properties are proved for our methods; these properties depend on the moment conditions and on the strength of dependence in the response and covariate processes, among other factors. Dependence is quantified by functional dependence measures (Wu, 2005) and β-mixing coefficients, and the results rely on Nagaev- and Rosenthal-type inequalities for dependent random variables. Finite sample performance of our methods is shown through extensive simulation studies, and we include an application to macroeconomic forecasting.
This supplementary document is organized as follows. Section 1 contains a detailed comparison between the partial distance correlation and conditional distance correlation measures. Section 2 contains additional empirical results of our application to forecasting market returns. Section 3 contains the sure screening properties, simulations, and a real data application of the group PDC-SIS procedure. Section 4 contains the proofs of Theorems 1 and 2 in the main paper. Lastly, Section 5 provides more detailed results of the simulations shown in the main paper.

1 Partial DC vs. Conditional DC

Compared to the conditional DC, partial DC has a number of advantages when it comes to constructing screening methods. First, partial DC can be easily computed using pairwise distance correlations and is much more computationally tractable when dealing with a large number of predictors. Computing conditional DC is more complicated; a screening procedure based on conditional DC therefore carries a much higher computational burden. More importantly, the computation of conditional DC involves the choice of a bandwidth matrix to compute a kernel density estimate for the conditioning vector. Selecting this bandwidth matrix is difficult in practice, especially for multivariate conditioning vectors, where the curse of dimensionality rapidly deteriorates the quality of our estimates.

In order to illustrate these effects, consider the following simple simulation example: we generate n = 100 random observations from $Y_t = \sum_{j=1}^{6} \beta_j X_{t-1,j} + \epsilon_t$, where $\epsilon_t$ i.i.d.
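To make the computational point in Section 1 concrete, the following minimal sketch computes a sample partial distance correlation from three pairwise quantities only. It assumes the standard U-centering form of the bias-corrected distance correlation due to Székely and Rizzo; the exact estimator used in the main paper may differ. The function names `u_centered`, `inner`, `bc_dcor`, and `partial_dcor` are hypothetical and chosen for illustration.

```python
import numpy as np


def u_centered(x):
    """U-centered Euclidean distance matrix of a sample (rows = observations)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n = x.shape[0]
    if n < 4:
        raise ValueError("U-centering needs at least 4 observations")
    # Pairwise Euclidean distances via broadcasting.
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    row = d.sum(axis=1, keepdims=True)
    col = d.sum(axis=0, keepdims=True)
    u = d - row / (n - 2) - col / (n - 2) + d.sum() / ((n - 1) * (n - 2))
    np.fill_diagonal(u, 0.0)  # diagonal is zero by definition
    return u


def inner(a, b):
    """Inner product of two U-centered matrices (diagonals are zero)."""
    n = a.shape[0]
    return (a * b).sum() / (n * (n - 3))


def bc_dcor(x, y):
    """Bias-corrected (squared) distance correlation between two samples."""
    a, b = u_centered(x), u_centered(y)
    denom = np.sqrt(inner(a, a) * inner(b, b))
    return inner(a, b) / denom if denom > 0 else 0.0


def partial_dcor(x, y, z):
    """Partial distance correlation of x and y given z, from pairwise terms."""
    rxy, rxz, ryz = bc_dcor(x, y), bc_dcor(x, z), bc_dcor(y, z)
    denom = np.sqrt((1.0 - rxz ** 2) * (1.0 - ryz ** 2))
    return (rxy - rxz * ryz) / denom if denom > 0 else 0.0
```

In a screening context, one would evaluate `partial_dcor(X[:, j], y, Z)` for each candidate predictor column `j` and rank the predictors by the resulting values, where `Z` is an assumed conditioning set such as lagged responses. No kernel density estimate or bandwidth matrix is required at any step, which is the contrast with conditional DC drawn above.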