This paper proposes a two-dimensional resource allocation technique for vehicle-to-infrastructure (V2I) communications. Vehicular communications require high data rates, low latency, and high reliability simultaneously. To support these requirements, the 3rd Generation Partnership Project (3GPP) introduced multiple numerologies, which diversified the transmit time interval (TTI). This diversity enables two-dimensional resource allocation that considers time and frequency jointly, a problem that has received little study so far. To address it, we propose a reinforcement learning approach to the two-dimensional resource allocation problem for V2I communications. A reinforcement learning agent at the base station allocates a quality of service (QoS)-guaranteed two-dimensional resource block to each vehicle so as to maximize the sum of achievable data quantity (ADQ). The agent takes received power information and the resource occupancy status as input and outputs each vehicle's allocation information, consisting of a time-frequency position, a bandwidth, and a TTI, which together constitute a solution to the two-dimensional resource allocation problem. Simulation results show that the proposed method outperforms the fixed allocation method. Because it pursues ADQ maximization and the QoS guarantee together, the proposed method also performs better than an optimization-based benchmark method when each vehicle has a QoS constraint. The results further show that the resource the agent selects varies with the QoS constraint while maximizing the ADQ.

INDEX TERMS Deep reinforcement learning, V2X communications, quality of service, resource allocation.
The associate editor coordinating the review of this manuscript and approving it for publication was Hao Wang.

I. INTRODUCTION
With the advent of complex applications that demand high data rates, low latency, or high reliability, next-generation communication networks that can support them have been actively discussed. The International Telecommunication Union (ITU) Radiocommunication Sector has defined three service types to meet the requirements of these new applications: enhanced mobile broadband (eMBB) for applications requiring high data rates, massive machine-type communications (mMTC) for applications requiring high-density networks, and ultra-reliable low-latency communications (URLLC) for latency-sensitive applications. Vehicle-to-everything (V2X) communications is a highly complex application that requires all three service types. It comprises several kinds of vehicle-related communications, such as vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) communications. Vehicles use V2V communications to exchange information directly with one another, and V2I communications to exchange information with infrastructure such as base stations or roadside units (RSUs) [1], [2], [3], [4].

Early vehicular communications focused on collision avoidance to reduce car crashes. They need delay-sensitive