“…We compute SVM forecasters using memories of increasing length, for training samples of two sizes, n = 800 and n = 1600. Because our estimate of the convergence speed of (7) relies on a loose concentration result, a suitable sequence of regularization parameters and kernel widths cannot be chosen a priori. We therefore select (λ_n, γ_n) for a given sample size n by a grid search over (λ, γ) space combined with 4-fold cross-validation [5]. Finally, we use (λ_n, γ_n) to compute the estimate f_{n,1,1,λ_n,γ_n} from (6) on the whole sample T_n (to simplify notation, we henceforth omit the dependence of f_{n,1,1} on the regularization parameters (λ_n, γ_n)).…”
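
The selection procedure described in the excerpt (grid search over (λ, γ), 4-fold cross-validation, then a refit on the whole sample) can be sketched as follows. This is a minimal illustration and not the authors' implementation. Several choices are assumptions that the excerpt does not confirm: a least-squares loss stands in for the SVM loss so that the fit reduces to a linear system, the Gaussian kernel is parameterized as exp(-γ‖a − b‖²), the regularizer is scaled as nλ, and the folds are contiguous blocks. All function names (`make_pairs`, `fit`, `cv_error`, `select_and_refit`) are hypothetical.

```python
import math


def make_pairs(series, memory):
    """Turn a series into (last `memory` values, next value) pairs."""
    X = [series[i - memory:i] for i in range(memory, len(series))]
    y = [series[i] for i in range(memory, len(series))]
    return X, y


def gauss_kernel(a, b, gamma):
    # Assumed parameterization: exp(-gamma * ||a - b||^2).
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit(X, y, lam, gamma):
    """Least-squares kernel fit (assumed loss): (K + n*lam*I) alpha = y."""
    n = len(X)
    K = [[gauss_kernel(X[i], X[j], gamma) + (n * lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y)
    return lambda x: sum(a * gauss_kernel(xi, x, gamma)
                         for a, xi in zip(alpha, X))


def cv_error(X, y, lam, gamma, folds=4):
    """Mean squared validation error over `folds` contiguous blocks."""
    n = len(X)
    total = 0.0
    for k in range(folds):
        lo, hi = k * n // folds, (k + 1) * n // folds
        f = fit(X[:lo] + X[hi:], y[:lo] + y[hi:], lam, gamma)
        total += sum((f(X[i]) - y[i]) ** 2 for i in range(lo, hi))
    return total / n


def select_and_refit(series, memory, lams, gammas):
    """Pick (lam, gamma) by 4-fold CV on the grid, then refit on all pairs."""
    X, y = make_pairs(series, memory)
    lam, gamma = min(((l, g) for l in lams for g in gammas),
                     key=lambda p: cv_error(X, y, p[0], p[1]))
    return fit(X, y, lam, gamma), (lam, gamma)


# Toy usage on a synthetic series (illustrative data, not from the paper).
series = [math.sin(0.3 * t) for t in range(40)]
forecaster, (lam_n, gamma_n) = select_and_refit(
    series, memory=2, lams=[1e-4, 1e-2, 1.0], gammas=[0.1, 1.0, 10.0])
next_value = forecaster(series[-2:])
```

The pure-Python solver keeps the sketch dependency-free but scales cubically in the sample size, so at n = 800 or n = 1600 one would normally hand the fit to a numerical library instead.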