Continuous integration (CI) testing is crucial in modern software engineering, and test case prioritization (TCP) techniques improve regression testing (RT) by ranking test cases (TCs). Various models have been developed to improve TC failure prediction and prioritization in CI environments, but prioritizing TCs on large test suites without loss of information remains a major challenge. To address this, a deep reinforcement prioritizer (DeepRP) model is proposed to improve TCP on large test suites. The model employs deep reinforcement learning (DRL) to learn richer test case features, such as source code changes, version control history and code coverage, and it enhances the self-optimization and adaptive ability of TCP. DRL training uses a deep neural network (DNN) to approximate the RL components, including the value function, Q function, transition model and reward function. Q-learning, an RL algorithm, selects the appropriate action for the agent based on its action-value function. The DeepRP model takes test case features as input and produces test case priorities as output. The actions include sorting TCs by their assigned scores, updating observations, computing a reward and storing the selected score in a temporary vector. The reward is computed from the distance between the assigned and optimal ranks, yielding better TCP on large test suites. Finally, experimental results show that DeepRP achieves RMSE values of 0.09, 0.11 and 0.10 on the Paint Control, IOF/ROL and GSDTSR datasets, respectively, which is lower than existing models such as DeepGini, Hansie, DeepOrder, LogTCP and RL-TCP.
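A rank-distance reward of this kind can be sketched as follows. This is a minimal illustrative implementation, assuming a normalized total rank displacement; the function name, normalization constant and NumPy formulation are assumptions for exposition, not the paper's exact definition:

```python
import numpy as np

def rank_distance_reward(assigned_scores, optimal_ranks):
    """Illustrative reward: 1 minus the normalized distance between the
    ranking induced by the agent's scores and the optimal ranking.
    A reward of 1.0 means the test cases are perfectly ordered."""
    assigned_scores = np.asarray(assigned_scores, dtype=float)
    optimal_ranks = np.asarray(optimal_ranks)
    # Rank test cases by descending score (higher score = higher priority).
    assigned_ranks = np.argsort(np.argsort(-assigned_scores))
    # Total displacement between assigned and optimal ranks.
    distance = np.abs(assigned_ranks - optimal_ranks).sum()
    n = len(assigned_scores)
    # Worst-case displacement (fully reversed ordering) is n*n // 2.
    max_distance = (n * n) // 2
    return 1.0 - distance / max_distance
```

For example, scores `[0.9, 0.5, 0.1]` against optimal ranks `[0, 1, 2]` yield a reward of 1.0, while the reversed scores `[0.1, 0.5, 0.9]` yield 0.0, so the agent is pushed toward orderings that match the optimal ranking.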