Deploying end-to-end connections over Mobile Ad-hoc Networks (MANETs) is the easiest way to establish communication infrastructure and internet connectivity in areas affected by natural disasters and in remote locations with intermittent cellular service and/or no Wi-Fi coverage. However, the potential of MANETs has yet to be fully realized, because existing MANET routing protocols still suffer from major technical drawbacks related to the mobility, link quality, and battery constraints of the mobile nodes between the overlay connections. To address these problems, a routing scheme named Mobility, Residual energy and Link quality Aware Multipath (MRLAM) is proposed for routing in MANETs. The proposed scheme makes routing decisions by determining the optimal route through energy-efficient nodes, in order to maintain the stability, reliability, and lifetime of the network over a sustained period. The MRLAM scheme uses a Q-Learning algorithm to select optimal intermediate nodes based on their current energy level, mobility, and link quality, and assigns positive or negative reward values accordingly. Compared with the well-known routing schemes Multipath Optimized Link State Routing protocol (MP-OLSR) and MP-OLSRv2, the proposed scheme reduces energy cost by approximately 33% and 23%, end-to-end delay by 15% and 10%, packet loss ratio by 30.76% and 24.59%, and convergence time by 16.49% and 11.34%, respectively. Overall, the results indicate that the proposed MRLAM routing scheme significantly improves network performance.
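As an illustration of the Q-Learning based selection of intermediate nodes described above, the following minimal Python sketch shows one way such a next-hop choice could be organized. It is not the MRLAM implementation: the `NodeStatus` fields, the equal weighting of the three metrics, the 0.5 threshold, the ±1 rewards, and the learning parameters `alpha` and `gamma` are all assumptions chosen for illustration, and the metrics are assumed to be normalized to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Hypothetical, normalized status of a candidate intermediate node."""
    residual_energy: float  # 0..1, higher is better
    mobility: float         # 0..1, higher means faster movement (worse)
    link_quality: float     # 0..1, higher is better


def reward(status: NodeStatus, threshold: float = 0.5) -> float:
    """Return +1 for a favorable node and -1 otherwise.

    Assumption: the three metrics are weighted equally; the scheme in the
    abstract may combine them differently.
    """
    score = (status.residual_energy
             + (1.0 - status.mobility)
             + status.link_quality) / 3.0
    return 1.0 if score >= threshold else -1.0


def q_update(q, state, action, r, next_state, alpha=0.5, gamma=0.8):
    """Apply the standard tabular Q-Learning update to q[state][action]."""
    next_best = max(q.get(next_state, {}).values(), default=0.0)
    old = q.setdefault(state, {}).get(action, 0.0)
    q[state][action] = old + alpha * (r + gamma * next_best - old)
    return q[state][action]


def best_next_hop(q, state):
    """Pick the neighbor with the highest learned Q-value, if any."""
    actions = q.get(state, {})
    return max(actions, key=actions.get) if actions else None


if __name__ == "__main__":
    # Node "A" evaluates two hypothetical neighbors over a few rounds.
    neighbors = {
        "B": NodeStatus(residual_energy=0.9, mobility=0.1, link_quality=0.8),
        "C": NodeStatus(residual_energy=0.2, mobility=0.9, link_quality=0.3),
    }
    q_table = {}
    for _ in range(3):
        for hop, status in neighbors.items():
            q_update(q_table, "A", hop, reward(status), hop)
    print(best_next_hop(q_table, "A"))  # prints: B
```

In this sketch a node with high residual energy, low mobility, and good link quality accumulates a positive Q-value and is preferred as the next hop, while a depleted, fast-moving node with a poor link is penalized.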