Long-term tracking of sparse local features in an image sequence is important for many applications, including camera calibration for stereo, camera or global motion estimation, and people surveillance. Most existing tracking frameworks are based on some form of prediction/correction, e.g., KLT and particle filters. However, given a careful selection of interest points throughout the sequence, the tracking problem can be solved with the Viterbi algorithm. This work introduces a novel approach to interest point selection for tracking that applies the Mean Shift algorithm over short time windows. The resulting points are then combined within a Viterbi framework to produce very long-term tracking data. The tracks are shown to be more accurate than those of traditional KLT implementations, and they do not suffer from accumulation of error over time.
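
To illustrate how a track can be recovered with the Viterbi algorithm once candidate interest points are available in every frame, the following is a minimal sketch and not the implementation used in this work. It assumes that each frame supplies a short list of candidate (x, y) points and that the transition cost between consecutive frames is the squared Euclidean displacement. The function name `viterbi_track`, the cost function, and the input layout are all illustrative assumptions, and the Mean Shift selection step is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def viterbi_track(candidates: Sequence[Sequence[Point]]) -> List[Point]:
    """Pick one candidate per frame so the summed squared displacement is minimal.

    Assumption: squared Euclidean distance stands in for the transition cost;
    a real system would likely add appearance or saliency terms.
    """
    if not candidates or any(len(frame) == 0 for frame in candidates):
        raise ValueError("every frame needs at least one candidate point")

    def cost(p: Point, q: Point) -> float:
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # best[i]: cheapest accumulated cost of any path ending at candidate i
    best = [0.0] * len(candidates[0])
    # back[t - 1][j]: index in frame t - 1 that best precedes candidate j in frame t
    back: List[List[int]] = []

    for t in range(1, len(candidates)):
        prev, cur = candidates[t - 1], candidates[t]
        new_best: List[float] = []
        pointers: List[int] = []
        for q in cur:
            costs = [best[i] + cost(p, q) for i, p in enumerate(prev)]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min])
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest end state to recover the whole path.
    idx = min(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]
```

Because the path is chosen globally over the whole sequence rather than frame by frame, an error in one frame is not propagated forward in the way it is in a prediction/correction scheme. For example, with the hypothetical input `[[(0, 0), (10, 10)], [(9, 9), (1, 0)], [(2, 1), (20, 20)]]`, the sketch selects the smooth path through `(0, 0)`, `(1, 0)`, and `(2, 1)`.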