In recent years, unmanned surface vehicles (USVs) have made important advances in civil, maritime, and military applications. With the continuous improvement of autonomy, the increasing complexity of tasks, and the emergence of various types of advanced sensors, higher requirements are imposed on the computing performance of USV clusters, especially for latency sensitive tasks. However, during the execution of marine operations, due to the relative movement of the USV cluster nodes and the network topology of the cluster, the wireless channel states are changing rapidly, and the computing resources of cluster nodes may be available or unavailable at any time. It is difficult to accurately predict in advance. Therefore, we propose an optimized offloading mechanism based on the classic multi-armed bandit (MAB) theory. This mechanism enables USV cluster nodes to dynamically make offloading decisions by learning the potential computing performance of their neighboring team nodes to minimize average computation task offloading delay. It is an optimized algorithm named Adaptive Upper Confidence Boundary (AUCB) algorithm, and corresponding simulations are designed to evaluate the performance. The algorithm enables the USV cluster to effectively adapt to the marine vehicular fog computing networks, balancing the trade-off between exploration and exploitation (EE). The simulation results show that the proposed algorithm can quickly converge to the optimal computation task offloading combination strategy under heavy and light input data loads.