Today's communication networks, particularly in the area of massive Machine-Type Communications (mMTC), face challenges such as channel congestion and high power consumption. This paper introduces the Dynamic Transmission and Delay Optimization Random Access (DTDO-RA) scheme to address these issues. The proposed approach adjusts the backoff indicator (BI) value according to the number of transmissions, improving the success rate of the random access (RA) procedure while simultaneously reducing channel congestion and power consumption. The strategy applies reinforcement learning algorithms, specifically Q-learning and Deep Deterministic Policy Gradient (DDPG), to fine-tune the BI value and the maximum number of preamble transmissions (Max TX), enabling efficient and responsive channel access management. The paper highlights the critical influence of the BI value on preamble transmission delay and overall access delay. The DTDO-RA method manages the number of preamble transmissions during the RA procedure, expanding the BI range in response to heavy RA traffic from user equipments (UEs); reducing channel congestion and power consumption in this way is key to raising the success rate of the RA procedure. Simulation results evaluate power consumption and channel congestion with respect to the number of UEs and the number of Random Access Opportunities (RAOs), providing a comparative perspective on the BI values selected by the DTDO-RA scheme. The results offer insights into achieving higher network efficiency, reduced network congestion, and lower power consumption.
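
The abstract describes tuning the BI value with reinforcement learning. The following minimal sketch, which is an illustration rather than the paper's actual implementation, shows how a tabular Q-learning agent could select a BI value per congestion level in a toy RA environment; the state space, candidate BI values, reward shape, and the `simulate_ra_slot` environment are all assumptions introduced here for clarity.

```python
import random

BI_VALUES = [0, 10, 20, 40, 80, 160]   # candidate BI values in ms (assumed)
N_STATES = 4                           # coarse congestion levels (assumed)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # learning rate, discount, exploration


def simulate_ra_slot(congestion_level, bi_index, rng):
    """Toy environment (assumed): a larger BI lowers collision odds under
    congestion but adds backoff delay. Returns (reward, next state)."""
    bi = BI_VALUES[bi_index]
    collision_prob = max(0.0, 0.2 * congestion_level - 0.001 * bi)
    collided = rng.random() < collision_prob
    # Reward penalizes both a failed (collided) attempt and backoff delay.
    reward = (-5.0 if collided else 1.0) - 0.01 * bi
    next_state = (min(N_STATES - 1, congestion_level + 1) if collided
                  else max(0, congestion_level - 1))
    return reward, next_state


def train(episodes=2000, seed=0):
    """Standard tabular Q-learning over (congestion level, BI index)."""
    rng = random.Random(seed)
    q = [[0.0] * len(BI_VALUES) for _ in range(N_STATES)]
    state = 0
    for _ in range(episodes):
        # epsilon-greedy selection over BI indices
        if rng.random() < EPSILON:
            action = rng.randrange(len(BI_VALUES))
        else:
            action = max(range(len(BI_VALUES)), key=lambda a: q[state][a])
        reward, next_state = simulate_ra_slot(state, action, rng)
        # Q-learning update rule
        q[state][action] += ALPHA * (
            reward + GAMMA * max(q[next_state]) - q[state][action])
        state = next_state
    return q


if __name__ == "__main__":
    q = train()
    policy = [max(range(len(BI_VALUES)), key=lambda a: q[s][a])
              for s in range(N_STATES)]
    print("BI per congestion level:", [BI_VALUES[i] for i in policy])
```

The continuous-action DDPG variant mentioned in the abstract would replace the discrete Q-table with actor and critic networks, but the same state/reward framing applies.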