Over the past decade, underwater acoustic sensor networks (UW-ASNs) have been studied extensively for a wide range of underwater applications, supporting human exploration of the underwater environment. This work proposes an architecture that combines acoustic and optical components to form an underwater wireless sensor network. It also introduces a multi-level Q-learning-based routing protocol, the Multi-layer Guidance Approach (MLGA), tailored to such networks. The network is physically grouped and logically divided into two tiers: in the upper tier, group leaders manage routing for the lower tier, where group members perform the actual data packet routing. This design exploits the wider view of the upper-tier group leaders and the concurrent learning that takes place across all groups, substantially improving routing efficiency over traditional approaches. Experimental results show that the proposed system is robust to changes in network topology and that, in dynamic networks, it achieves higher delivery rates and lower delays than flat Q-learning routing. The approach offers a more effective and dependable means of transmitting data underwater than conventional communication methods, and it may support broader exploration and understanding of underwater environments.
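As a rough illustration of the two-tier idea described above, the following Python sketch trains one tabular Q-learning router over groups (the upper tier, run by group leaders) and a second one over the members of a single group (the lower tier). It is not the MLGA implementation: the toy topology, the link costs, the learning and exploration rates, and all names (`QRouter`, `gateway_for`, `gw_B`, the node labels) are hypothetical assumptions, and the update is the standard cost-minimizing Q-learning rule rather than anything specified in the text. The concurrent learning across groups is also not modelled; only one group is shown.

```python
import random
from collections import defaultdict


class QRouter:
    """Tabular Q-learning router; q[(node, next_hop)] estimates cost to the goal."""

    def __init__(self, neighbors, costs, alpha=0.5, epsilon=0.1, seed=0):
        self.neighbors = neighbors      # node -> list of candidate next hops
        self.costs = costs              # (node, next_hop) -> link cost (e.g. delay)
        self.alpha = alpha              # learning rate (assumed value)
        self.epsilon = epsilon          # exploration probability (assumed value)
        self.q = defaultdict(float)     # unseen pairs start at cost 0
        self.rng = random.Random(seed)

    def best(self, node):
        """Greedy next hop: the neighbor with the lowest estimated cost."""
        return min(self.neighbors[node], key=lambda n: self.q[(node, n)])

    def step(self, node, goal):
        """Pick a next hop (epsilon-greedy), update its Q-value, return the hop."""
        if self.rng.random() < self.epsilon:
            hop = self.rng.choice(self.neighbors[node])
        else:
            hop = self.best(node)
        # Estimated remaining cost after the hop; zero once the goal is reached.
        future = 0.0 if hop == goal else min(
            self.q[(hop, n)] for n in self.neighbors[hop])
        target = self.costs[(node, hop)] + future
        self.q[(node, hop)] += self.alpha * (target - self.q[(node, hop)])
        return hop

    def train(self, start, goal, episodes=200, max_hops=20):
        """Route repeated packets from start toward goal, learning along the way."""
        for _ in range(episodes):
            node = start
            for _ in range(max_hops):
                node = self.step(node, goal)
                if node == goal:
                    break


# Upper tier: group leaders learn which neighboring group leads to the sink "S".
upper = QRouter(
    neighbors={"A": ["B", "C"], "B": ["S"], "C": ["S"]},
    costs={("A", "B"): 1, ("B", "S"): 1, ("A", "C"): 1, ("C", "S"): 5},
)
upper.train(start="A", goal="S")
next_group = upper.best("A")

# Lower tier: members of group A learn a path to the gateway facing that group.
# Only the gateway toward group B is modelled in this toy topology.
gateway_for = {"B": "gw_B"}
lower = QRouter(
    neighbors={"a1": ["a2", "a3"], "a2": ["gw_B"], "a3": ["gw_B"]},
    costs={("a1", "a2"): 1, ("a2", "gw_B"): 1, ("a1", "a3"): 2, ("a3", "gw_B"): 3},
)
lower.train(start="a1", goal=gateway_for[next_group])
next_hop = lower.best("a1")

print(next_group, next_hop)
```

In this sketch the upper-tier decision only sets the goal for the lower tier, so each lower-tier table covers a single group rather than the whole network. A flat Q-learning router would instead keep one table over every node, which is the baseline the comparison above refers to.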