Efficient use of network resources, particularly channel bandwidth, is critical for optimizing overall system performance and ensuring fair resource allocation among distributed computing nodes. Traditional channel bandwidth allocation methods, based on fixed allocation schemes or static heuristics, often lack the adaptability needed to respond to dynamic changes in the network and may not fully exploit the system’s potential. To address these limitations, we employ reinforcement learning algorithms that learn optimal channel allocation policies by interacting with the environment and receiving feedback on the outcomes of their actions. This allows devices to adapt to changing network conditions and optimize resource usage. We evaluate the proposed framework through simulation experiments. The results show that the framework consistently achieves higher system throughput than conventional static allocation methods and state-of-the-art bandwidth allocation techniques. It also exhibits lower latency, indicating faster data transmission and reduced communication delays. Additionally, the hybrid approach improves resource utilization, leveraging the strengths of both Q-learning and reinforcement learning for resource allocation and management.
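To illustrate the kind of learning loop described above, the following is a minimal sketch of tabular Q-learning on a toy channel-selection problem. It is not the framework evaluated in this work: the environment (one congested channel per time step, chosen uniformly at random), the reward values, the hyperparameters, and the names `train_q_table` and `greedy_policy` are all assumptions made for illustration.

```python
import random


def train_q_table(n_channels=3, steps=20000, alpha=0.1, gamma=0.5,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy channel-selection task (illustrative only).

    State  : index of the currently congested channel (assumed observable).
    Action : index of the channel the device transmits on.
    Reward : 0.1 on the congested channel, 1.0 otherwise (assumed values).
    """
    rng = random.Random(seed)
    # q[state][action] holds the estimated return of each choice.
    q = [[0.0] * n_channels for _ in range(n_channels)]
    state = rng.randrange(n_channels)
    for _ in range(steps):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if rng.random() < epsilon:
            action = rng.randrange(n_channels)
        else:
            action = max(range(n_channels), key=lambda a: q[state][a])
        # Feedback from the (toy) environment.
        reward = 0.1 if action == state else 1.0
        # Congestion moves to a random channel, independent of the action.
        next_state = rng.randrange(n_channels)
        # Standard Q-learning temporal-difference update.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued channel for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    policy = greedy_policy(train_q_table())
    for congested, chosen in enumerate(policy):
        print(f"congested channel {congested} -> transmit on {chosen}")
```

In this toy setting the learned greedy policy avoids whichever channel is congested, which is the adaptive behaviour a fixed allocation scheme cannot provide. A realistic state would also need to encode quantities such as queue lengths or measured interference, which this sketch omits.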