Existing transportation infrastructure and traffic control systems are under increasing strain from rising demand, which results in frequent congestion. Because expanding the infrastructure is not a feasible way to increase road capacity, Intelligent Transportation Systems are often employed to improve the Level of Service (LoS). One such approach is Variable Speed Limit (VSL) control, which increases the LoS and safety on motorways by adapting the speed limit to the prevailing traffic conditions. The proliferation of Connected and Autonomous Vehicles (CAVs) opens new opportunities for measuring traffic states and for operating VSL control systems. This paper introduces a method for detecting multiple congested areas, which is used for state estimation in a dynamically positioned VSL control system for urban motorways. The method uses Q-Learning (QL) and employs CAVs as mobile sensors and actuators. The proposed control approach, named Congestion Detection QL Dynamic Position VSL (CD-QL-DPVSL), dynamically detects all congested areas and applies two sets of actions, the positioning of speed limit zones and the imposed speed limits, to all detected congested areas simultaneously. CD-QL-DPVSL was evaluated in six distinct traffic scenarios with CAV penetration rates ranging from 10% to 100%. It performed significantly better than the other control approaches considered: no control, rule-based VSL, two Speed-Transition-Matrices-based QL-VSL configurations with fixed speed limit zone positions, and a Speed-Transition-Matrices-based QL-DVSL with a dynamic speed limit zone position. By adapting its control policy to each simulated scenario, it improved macroscopic traffic parameters such as the Mean Travel Time and Total Time Spent.
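To illustrate the two ingredients named above, the following minimal Python sketch shows one possible way to detect several congested areas from CAV reports and to apply a standard tabular QL update to a two-part action. It is not the CD-QL-DPVSL implementation. The detection rule (fixed-length cells, a mean-speed threshold), the function names `detect_congested_areas` and `q_update`, and all parameter values are assumptions made only for this illustration.

```python
from collections import defaultdict


def detect_congested_areas(reports, cell_length_m=100.0, speed_threshold_kmh=50.0):
    """Return a list of (start_m, end_m) congested areas.

    reports: iterable of (position_m, speed_kmh) pairs sent by CAVs.
    Assumed rule: a cell is congested if the mean reported speed in it is
    below the threshold. Cells without any report are treated as unknown
    and are not marked as congested.
    """
    speeds_by_cell = defaultdict(list)
    for position_m, speed_kmh in reports:
        speeds_by_cell[int(position_m // cell_length_m)].append(speed_kmh)

    congested_cells = sorted(
        cell
        for cell, speeds in speeds_by_cell.items()
        if sum(speeds) / len(speeds) < speed_threshold_kmh
    )

    # Merge adjacent congested cells into contiguous runs [first, last].
    runs = []
    for cell in congested_cells:
        if runs and runs[-1][1] == cell - 1:
            runs[-1][1] = cell
        else:
            runs.append([cell, cell])

    return [(first * cell_length_m, (last + 1) * cell_length_m) for first, last in runs]


def q_update(q_table, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-Learning update.

    action is assumed to be a pair (zone_position, speed_limit), mirroring
    the two sets of actions described in the text.
    """
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


if __name__ == "__main__":
    # Hypothetical CAV reports: (position in metres, speed in km/h).
    reports = [(50, 100), (120, 30), (180, 40), (250, 35), (560, 20), (700, 110)]
    print(detect_congested_areas(reports))

    # Hypothetical action set: (zone offset in metres, speed limit in km/h).
    actions = [(200, 60), (200, 80), (400, 60), (400, 80)]
    q_table = defaultdict(float)
    print(q_update(q_table, "s0", actions[0], -1.0, "s1", actions))
```

In this sketch, each detected area would receive its own state and action pair, so several speed limit zones could be positioned at the same time. At low CAV penetration rates many cells carry no reports, which is why the sketch leaves such cells unclassified rather than assuming free flow or congestion.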