In an autonomous underwater vehicle (AUV)-based optical-acoustic hybrid network, achieving ultra-high-speed, reliable communication is critical for exploiting the complementary strengths of the two systems and for supporting high-bandwidth, low-latency operations. However, because mobile AUVs operate in harsh oceanic environments, an effective switching algorithm is essential for flexible hybrid acoustic-optical communication and for increasing network throughput. In this paper, we propose a Q-learning-based adaptive switching scheme that maximizes network throughput by capturing both the dynamics of the time-varying channels and the mobility of the AUVs. To address the partial observability of the optical channel and to improve switching efficiency in extreme conditions, we design a blind optical channel estimation method, implemented with an extended Kalman filter (EKF), that exploits the relationship between the underwater acoustic and optical channels to improve channel prediction accuracy. Based on the estimated environmental state, a reinforcement learning approach is used to build a near-optimal switching strategy for the hybrid network. Numerical simulations verify the performance of the scheme, and the results demonstrate that the proposed switching scheme is effective and robust.
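To make the idea of Q-learning-based link switching concrete, the following is a minimal tabular sketch and not the scheme proposed in this paper. It assumes a toy two-state optical channel ("clear" or "turbid") that the agent observes directly, whereas the paper estimates the channel state with an EKF. The reward values, transition probability, hyperparameters, and the names `throughput`, `train_switching_policy`, and `best_link` are all hypothetical choices made for illustration.

```python
import random

ACTIONS = ("acoustic", "optical")
STATES = ("clear", "turbid")


def throughput(state, action):
    """Hypothetical reward model (arbitrary units, not from the paper).

    The optical link is fast only when the water is clear; the acoustic
    link is slow but works in both channel states.
    """
    if action == "optical":
        return 10.0 if state == "clear" else 0.0
    return 2.0


def train_switching_policy(steps=20000, alpha=0.02, gamma=0.5,
                           epsilon=0.2, persist=0.8, seed=0):
    """Tabular Q-learning on the toy two-state channel (all values assumed)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Epsilon-greedy choice between the two links.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = throughput(state, action)
        # Assumed channel dynamics: the state persists with probability
        # `persist`, otherwise it flips; the chosen link does not affect it.
        if rng.random() < persist:
            next_state = state
        else:
            next_state = "turbid" if state == "clear" else "clear"
        # Standard Q-learning update toward the bootstrapped target.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next
                                       - q[(state, action)])
        state = next_state
    return q


def best_link(q, state):
    """Greedy link selection from a learned Q-table."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q_table = train_switching_policy()
    for s in STATES:
        print(s, "->", best_link(q_table, s))
```

In this toy setting the learned greedy policy selects the optical link when the channel is clear and falls back to the acoustic link otherwise. A more faithful model would replace the directly observed state with an estimated one and include AUV mobility in the state description.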