Detecting regions of interest with visual sensors is fundamental to many underwater robotic applications. In an autonomous exploration task in particular, an underwater vehicle must be guided towards features of interest. If the relevant features can be seen from a distance, the vehicle can use smooth control movements to position itself close enough to gather high-quality images. However, stable tracking of the same regions is challenging for a robotic system because marine environments are unstructured, highly dynamic, and usually have poor visibility. In this paper, we present a framework that robustly detects and tracks regions of interest in real time. We use the chromatic channels of a perceptually uniform color space to detect relevant regions and adapt a visual attention scheme to underwater scenes. For tracking, we associate each relevant point with superpixel descriptors that are invariant to changes in illumination and shape. Field experiments demonstrate that our approach is robust under different visibility conditions and at different depths during underwater exploration.
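To make the detection idea concrete, the following is a minimal sketch of chromatic saliency, not the framework's actual implementation. It rests on two assumptions that the text above does not state: that CIELAB serves as the perceptually uniform color space, and that a pixel's relevance can be approximated by the distance of its chromatic channels (a*, b*) from the image-wide mean chroma. The function names `srgb_to_lab` and `chromatic_saliency` are hypothetical, and the visual attention scheme and superpixel tracking are not covered here.

```python
import numpy as np

# Linear sRGB (D65) to CIE XYZ conversion matrix.
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
# D65 reference white in XYZ.
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])


def srgb_to_lab(rgb):
    """Convert an (H, W, 3) sRGB image with values in [0, 1] to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB transfer function to obtain linear RGB.
    linear = np.where(rgb <= 0.04045,
                      rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear RGB to XYZ, normalized by the reference white.
    xyz = linear @ _RGB_TO_XYZ.T / _WHITE_D65
    # CIELAB nonlinearity with its linear segment near zero.
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3,
                 np.cbrt(xyz),
                 xyz / (3.0 * delta ** 2) + 4.0 / 29.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def chromatic_saliency(rgb):
    """Return an (H, W) map in [0, 1] of chromatic distance from the mean.

    Assumption: relevance is approximated by how far a pixel's (a*, b*)
    pair lies from the mean (a*, b*) of the whole image. Lightness is
    ignored so that brightness changes alone do not raise saliency.
    """
    ab = srgb_to_lab(rgb)[..., 1:]
    mean_ab = ab.reshape(-1, 2).mean(axis=0)
    dist = np.linalg.norm(ab - mean_ab, axis=-1)
    peak = dist.max()
    return dist / peak if peak > 0 else dist


if __name__ == "__main__":
    # Synthetic scene: blue-green background with a small reddish patch.
    image = np.zeros((20, 20, 3))
    image[...] = [0.1, 0.4, 0.5]
    image[8:12, 8:12] = [0.9, 0.2, 0.1]
    saliency = chromatic_saliency(image)
    row, col = np.unravel_index(np.argmax(saliency), saliency.shape)
    print("most salient pixel:", row, col)
```

Discarding the lightness channel is the design choice that matters in this sketch: underwater illumination varies strongly with depth and turbidity, so a measure based only on chroma is less sensitive to brightness changes than one computed on RGB values directly.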