The 6LoWPAN (IPv6 over Low-Power Wireless Personal Area Networks) standard enables resource-constrained devices to connect to IPv6 networks by incorporating an IPv6 header compression mechanism. For this network technology, a new routing protocol, the Routing Protocol for Low-Power and Lossy Networks (RPL), has been designed. RPL is a lightweight protocol that determines routes across nodes based on rank values. It is known to be vulnerable to Rank attacks, which aim to create non-optimized routes for packet forwarding and thereby overwhelm the constrained 6LoWPAN. With 5G, Software-Defined Networks (SDNs) have been developed to facilitate a simple programmable control plane, Quality of Service (QoS) provisioning, and route configuration services for 6LoWPAN. However, there is still a lack of optimization mechanisms to protect 6LoWPAN against Rank attacks in SDN-based deployments. To this end, in this paper, a Reinforcement Learning (RL) agent is leveraged to assist and complement an SDN controller in achieving cost-efficient route optimization and QoS-provisioned packet forwarding to prevent Rank attacks. Experimental results confirm that our approach effectively prevents Rank attacks while providing an adequate delay and radio duty cycle. It also maximizes the packet delivery ratio, facilitating practical implementations in software-defined low-power Internet of Things (IoT) networks.