Abstract--We consider sequential regression of individual sequences under the square-error loss. Using a competitive-algorithm framework, we construct a sequential algorithm that achieves the performance of the best piecewise (in time) linear regression algorithm tuned to the underlying individual sequence. The sequential algorithm we construct does not need to know the data length, the number of piecewise-linear regions, or the locations of the transition times; nevertheless, it asymptotically achieves the performance of the best piecewise (in time) linear regressor that chooses the number of segments, the durations of these segments, and the best regressor in each segment based on observation of the whole sequence in advance. We use a transition diagram similar to that of [Willems '96] to efficiently combine an exponential number of competing algorithms with a complexity that is only linear in the data length. We show that the regret of this approach is at most O(4 ln(n)) per transition for not knowing the best transition times and at most O(ln(n)) for estimating the best linear regressor in each segment, where n is the total length of the observation process. Lower bounds for any sequential algorithm demonstrate a form of minimax optimality in certain settings. We then extend these results to allow a finite collection of competing algorithms, rather than linear regressors, within each time segment.
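To make the mixture idea concrete, the following is a minimal sketch of sequentially combining a pool of competing predictors by exponential weighting under the square-error loss. This is an illustration of the general weighting principle only, not the paper's transition-diagram algorithm; the expert definitions and the learning rate `eta` are illustrative assumptions (for outcomes and predictions in [0, 1], eta = 1/2 is a standard choice for which the squared loss is exp-concave).

```python
import numpy as np

def mixture_predict(data, experts, eta=0.5):
    """Sequentially combine expert predictors by exponential weighting.

    Each expert is a function mapping the observed past data[:t] to a
    prediction of data[t]. Returns the mixture's cumulative squared loss
    and each expert's cumulative squared loss.
    """
    K = len(experts)
    log_w = np.zeros(K)      # log-weights, uniform prior over experts
    losses = np.zeros(K)     # cumulative squared loss of each expert
    mix_loss = 0.0
    for t in range(len(data)):
        preds = np.array([e(data[:t]) for e in experts])
        w = np.exp(log_w - log_w.max())    # normalize in a stable way
        w /= w.sum()
        y_hat = float(w @ preds)           # weighted-average prediction
        mix_loss += (data[t] - y_hat) ** 2
        inst = (data[t] - preds) ** 2      # instantaneous expert losses
        losses += inst
        log_w -= eta * inst                # exponential weight update
    return mix_loss, losses
```

For exp-concave losses, this weighted-average forecaster's cumulative loss exceeds the best expert's by at most ln(K)/eta, a logarithmic (in the number of experts) regret independent of the data length; the paper's contribution is to realize such a mixture over an exponentially large class of piecewise regressors at only linear complexity.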
I. INTRODUCTION

A common approach in adaptive signal processing is to take the problem at hand, such as equalization, prediction, or some other sequential decision problem, and turn it into an associated parametric modeling or estimation problem. By forcing the problem into this form, we must then live with the performance of the resulting parameter estimation problem, which is in general worse, often significantly worse, than what could have been obtained by addressing the original problem directly. Moreover, if the assumptions in the model do not match reality, the performance of an algorithm tuned to the assumed statistical model may deteriorate considerably.

In this paper, we approach the prediction problem from a competitive-algorithm point of view. By defining a competitive framework, we try to achieve the performance of the best algorithm from a large class of candidate algorithms, rather than attempting to fit a given model to the data at hand. The performance measure of interest is then defined relative to the best algorithm in this class, instead of as the usual parametric modeling error between the output of the modeling algorithm and the desired signal. We will show that by not forcing the algorithms to make hard decisions about a set of parameters at each step, but rather permitting a competition among many candidate models, we can obtain algorithms that