Abstract-The challenge of modern financial forecasting stems from the high dimensionality, nonlinearity, and non-stationarity of financial data. To address this problem, this study employs wavelet analysis to map time-domain inputs to the time-frequency (or wavelet) domain, and sparse multi-manifold clustering (SMMC) to partition the high-dimensional feature space into several disjoint regions according to their dynamics. In the final stage, hierarchical multiple kernel machines (HMKM) are employed to perform the high-dimensional forecasting and trading. In our system, SMMC can effectively cluster data lying on multiple manifolds, including manifolds that are very close to each other and manifolds with non-uniform sampling and holes. HMKM embeds basis kernels in a directed acyclic graph and optimizes them with a graph-adapted sparsity-inducing norm, which performs feature selection in time polynomial in the number of selected kernels. The empirical results demonstrate that the proposed model outperforms traditional neural networks, support vector machines, and statistical models, and significantly reduces forecasting errors.

The tight correlations among financial markets provide investors with valuable information for making accurate forecasts of the co-movements of stock indices. International investors are a diverse group, operating on very different time scales. As a result, the correlation patterns between international market indices are not fixed across time scales. Consequently, this study aims to address the problem with a new model in the wavelet domain that fully exploits time-frequency features of high-dimensional financial time series. Namely, this study will implement a new forecasting strategy which

Another problem in forecasting is that financial time series are usually non-stationary; that is, they switch their dynamics between different regions. This leads to changes in the dependency structure between input and output variables.
Consequently, it is difficult for a single predictor to capture such a switching input-output relationship. Inspired by the "divide-and-conquer" principle that is often used to attack complex problems, local modeling approaches have emerged as a promising method for time series prediction (Oh [28]). This study employs a sparse multi-manifold clustering (SMMC) algorithm (Elhamifar and Vidal [14]) to partition the feature space into several disjoint regions, one for each kind of time series dynamics. We then employ an architecture involving multiple experts to overcome the problem, namely, using a different expert for each feature region.
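The divide-and-conquer idea described above can be illustrated with a minimal sketch. It is not the method of this study: plain one-dimensional k-means stands in for SMMC, and an ordinary least-squares line stands in for each HMKM expert. The function names (`kmeans_1d`, `fit_line`, `fit_local_experts`, `predict`) and the toy two-regime data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means (a stand-in for SMMC, not SMMC itself)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: evenly spaced quantiles of the data.
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a region received no points.
        centers = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
    return centers


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (a stand-in expert)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def fit_local_experts(xs, ys, k):
    """Partition the inputs into k regions and fit one expert per region."""
    centers = kmeans_1d(xs, k)
    experts = []
    for j in range(k):
        members = [
            (x, y) for x, y in zip(xs, ys)
            if min(range(k), key=lambda c: abs(x - centers[c])) == j
        ]
        if members:
            gx, gy = zip(*members)
            experts.append(fit_line(gx, gy))
        else:
            # Empty region: fall back to a single global expert.
            experts.append(fit_line(xs, ys))
    return centers, experts


def predict(model, x):
    """Route x to the expert of its nearest region and evaluate it."""
    centers, experts = model
    j = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
    intercept, slope = experts[j]
    return intercept + slope * x


# Toy series with two regimes that follow different input-output relationships.
xs = [i / 10 for i in range(10)] + [10 + i / 10 for i in range(10)]
ys = [2 * x for x in xs[:10]] + [-3 * x + 50 for x in xs[10:]]

model = fit_local_experts(xs, ys, k=2)
print(predict(model, 0.5), predict(model, 10.5))
```

On this toy data, a single global line cannot match both regimes, whereas each local expert only has to fit the relationship within its own region.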
Keywords-Manifold

Sparse manifold clustering excels at partitioning data points that are non-linearly distributed on multiple manifolds. In contrast to traditional nearest-neighbor-based methods for manifold modeling, which fix the number of neighbors or the neighborhood radius and then compute the weights between points in each neighborhood, SMMC finds both the neighbors and the weights automatically. SMMC automat...