As demand for sensing and monitoring the marine environment grows, the Ocean Mobile Internet of Things (OM-IoT) has attracted increasing research interest. However, the complex and dynamic marine environment, node mobility, and other factors make communication links unreliable, which poses a significant challenge to data transmission in the OM-IoT. It is therefore necessary to enhance the reliability of underwater data transmission. To address this issue, this paper proposes a reinforcement learning-based adaptive network coding (RL-ANC) approach. First, the channel conditions are estimated from reception acknowledgments, and a feedback-independent method for estimating the decoding state is proposed. Second, the sliding coding window is dynamically adjusted based on the estimated channel erasure probability and decoding probability, and the sliding rule is adaptively determined using a reinforcement learning algorithm and an enhanced greedy strategy. Third, a reinforcement learning-based adaptive optimization method for the coding coefficients is proposed to improve the reliability of underwater data transmission and network coding while reducing coding redundancy. Finally, the sampling period and time slot table are updated using an enhanced simulated annealing algorithm to optimize the accuracy and timeliness of channel estimation. Simulation experiments demonstrate that the proposed method effectively enhances data transmission reliability over unreliable communication links, improves the performance of underwater network coding in terms of the packet delivery rate and the retransmission and redundancy transmission ratios, and accelerates the convergence of the decoding probability.
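
To make the first two steps concrete, the following minimal Python sketch illustrates one way acknowledgment-driven erasure estimation and a greedy window choice could fit together. It is not the RL-ANC algorithm itself: the abstract does not specify the estimator or the enhanced greedy strategy, so the sketch assumes an exponentially weighted moving average for the erasure probability and a plain epsilon-greedy rule as a stand-in. All function names, parameters (`alpha`, `epsilon`, `lr`), and the candidate window sizes are hypothetical.

```python
import random


def update_erasure_estimate(p_hat, acked, alpha=0.1):
    """Assumed estimator: EWMA of packet losses inferred from acknowledgments.

    p_hat  -- current estimate of the channel erasure probability
    acked  -- True if the packet was acknowledged, False if it was lost
    alpha  -- smoothing factor (hypothetical value)
    """
    loss = 0.0 if acked else 1.0
    return (1.0 - alpha) * p_hat + alpha * loss


def choose_window(q_values, epsilon, rng):
    """Plain epsilon-greedy choice over candidate coding-window sizes.

    q_values -- dict mapping window size to its estimated value
    epsilon  -- exploration probability
    rng      -- a random.Random instance, passed in for reproducibility
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))       # explore a random window size
    return max(q_values, key=q_values.get)      # exploit the best-valued size


def update_q(q_values, window, reward, lr=0.2):
    """Move the value of the chosen window size toward the observed reward."""
    q_values[window] += lr * (reward - q_values[window])


if __name__ == "__main__":
    rng = random.Random(0)
    p_hat = 0.5
    q = {2: 0.0, 4: 0.0, 8: 0.0}                # hypothetical window sizes
    for _ in range(20):
        acked = rng.random() > 0.3              # toy channel, 30% erasures
        p_hat = update_erasure_estimate(p_hat, acked)
        w = choose_window(q, epsilon=0.1, rng=rng)
        update_q(q, w, reward=1.0 if acked else 0.0)
    print(p_hat, q)
```

In this sketch the reward is simply whether the packet was acknowledged; a fuller treatment would presumably also reflect the decoding probability and the redundancy cost that the approach aims to reduce.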