This study leverages the Semantic Segmentation of Underwater Imagery (SUIM) dataset, which contains over 1,500 annotated images spanning eight object categories: fish (vertebrates), reefs and invertebrates, aquatic vegetation, wrecks, human divers, robots, and the seafloor. The images were collected during oceanic expeditions and cooperative human-robot experiments. We evaluate state-of-the-art semantic segmentation techniques on this dataset using established performance metrics. In addition, we introduce a fully convolutional encoder-decoder model designed to balance competitive segmentation performance with computational efficiency; it achieves an accuracy of 88% on underwater image segmentation. The model's integration into the autonomy pipeline of visually-guided underwater robots demonstrates its practical applicability, as its fast end-to-end inference meets the real-time decision-making requirements of autonomous systems. We further illustrate its utility across applications such as visual servoing, saliency prediction, and scene understanding. Notably, preprocessing with the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) improves input image quality and contributes to the model's performance. By presenting both the model and the benchmark dataset, this research establishes a foundation for future work in underwater robot vision.
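To make the evaluation protocol concrete, the sketch below shows how established segmentation metrics such as pixel accuracy and mean intersection-over-union (mIoU) are commonly computed from a class confusion matrix. This is a minimal, generic illustration in NumPy and not the exact evaluation code of this study; the function names, the simulated masks, and the use of eight classes (mirroring the SUIM label set) are illustrative assumptions.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes):
    """Accumulate a (num_classes x num_classes) confusion matrix
    from integer label maps of identical shape (rows: ground truth,
    columns: prediction)."""
    mask = (target >= 0) & (target < num_classes)
    hist = np.bincount(
        num_classes * target[mask].astype(int) + pred[mask],
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)
    return hist

def pixel_accuracy(hist):
    """Fraction of correctly labeled pixels over all pixels."""
    return np.diag(hist).sum() / hist.sum()

def mean_iou(hist):
    """Mean intersection-over-union over classes present in the data."""
    intersection = np.diag(hist)
    union = hist.sum(axis=0) + hist.sum(axis=1) - intersection
    iou = intersection / np.maximum(union, 1)
    return iou[union > 0].mean()

if __name__ == "__main__":
    # Hypothetical example: compare a corrupted copy of a random
    # 8-class label map against itself as "ground truth".
    rng = np.random.default_rng(0)
    gt = rng.integers(0, 8, size=(256, 256))
    pred = gt.copy()
    pred[rng.random(pred.shape) < 0.1] = 0  # mislabel ~10% of pixels
    hist = confusion_matrix(pred, gt, num_classes=8)
    print(f"pixel accuracy: {pixel_accuracy(hist):.3f}")
    print(f"mean IoU:       {mean_iou(hist):.3f}")
```

In practice, the confusion matrix would be accumulated over all test images before the metrics are computed, so that per-class statistics are not biased by individual frames; the 88% accuracy figure reported above refers to the study's own evaluation protocol, which may differ in detail from this sketch.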