Internet services in today's digital ecosystem impose different quality of service (QoS) requirements on the network. This diversity adds complexity to existing congestion control protocols and has driven the adoption of deep reinforcement learning to improve the protocols' adaptability to dynamic QoS requirements. State-of-the-art work on learning-based congestion control formulates a Markov decision process (MDP) by transforming the control pattern from the sawtooth congestion window to staircase per-interval sending-rate cycles, treating congestion control as a sequential decision-making process that fits reinforcement learning. However, the interval configuration that yields the optimal QoS has not been empirically studied. In this work, we present an extensive study of interval configurations, each consisting of an RTT estimator and the n parameter, for a deep reinforcement learning-based congestion control agent. Our experiments show that different interval configurations result in different QoS: RTT_jk achieves significantly higher throughput than RTT_ewma and RTT_min-filtered across various network conditions. Furthermore, RTT_jk with n = 2.0 is superior to the other configurations in almost all networking scenarios, whereas RTT_jk with n = 1.0 is optimal for networks with fixed bandwidth.
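To make the interval configuration concrete, the sketch below illustrates one plausible reading, assuming (as in PCC-style monitor intervals) that each interval spans n multiples of an RTT estimate, and using common textbook forms of the three estimators: an exponentially weighted moving average (RTT_ewma), a windowed minimum filter (RTT_min-filtered), and the Jacobson-Karels smoothed estimate with a variance term (RTT_jk). The class and function names, coefficients, and window size are illustrative assumptions, not the paper's implementation.

```python
import collections

class RttEstimators:
    """Illustrative RTT estimators; coefficients and window size are
    textbook-style assumptions, not necessarily the paper's values."""

    def __init__(self, alpha=0.125, beta=0.25, window=10):
        self.alpha = alpha    # smoothing gain (EWMA and Jacobson-Karels)
        self.beta = beta      # deviation gain (Jacobson-Karels)
        self.ewma = None      # RTT_ewma state
        self.srtt = None      # smoothed RTT (Jacobson-Karels)
        self.rttvar = None    # RTT mean deviation (Jacobson-Karels)
        self.samples = collections.deque(maxlen=window)  # min-filter window

    def update(self, rtt):
        """Feed one RTT sample (seconds) into all three estimators."""
        self.samples.append(rtt)
        # RTT_ewma: exponentially weighted moving average.
        self.ewma = rtt if self.ewma is None else (
            (1 - self.alpha) * self.ewma + self.alpha * rtt)
        # RTT_jk: Jacobson-Karels smoothed RTT plus mean deviation,
        # as in the classic TCP retransmission-timeout calculation.
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(rtt - self.srtt))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt

    def rtt_ewma(self):
        return self.ewma

    def rtt_jk(self):
        return self.srtt + 4 * self.rttvar

    def rtt_min_filtered(self):
        return min(self.samples)


def interval_length(rtt_estimate, n):
    """One sending-rate interval, assumed to span n RTT estimates."""
    return n * rtt_estimate


# Example: after observing a few RTT samples, derive the interval
# lengths for two of the configurations compared in the study.
est = RttEstimators()
for sample in [0.050, 0.062, 0.048, 0.055]:
    est.update(sample)
print(interval_length(est.rtt_jk(), n=2.0))  # RTT_jk, n = 2.0
print(interval_length(est.rtt_jk(), n=1.0))  # RTT_jk, n = 1.0
```

Under this reading, a larger n yields longer monitor intervals and hence fewer, smoother rate decisions per unit time, which is one plausible reason the optimal n differs between dynamic and fixed-bandwidth scenarios.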