In heterogeneous wireless networks, multiple nodes sharing the same wireless channel face the multiple-access problem, which necessitates a Medium Access Control (MAC) protocol to coordinate their transmissions on the shared channel. This paper presents Proximal Policy Optimization-based Multiple Access (PPOMA), a novel multiple access protocol for heterogeneous wireless networks built on the Proximal Policy Optimization (PPO) algorithm from deep reinforcement learning (DRL). Specifically, we consider a network scenario in which multiple nodes employ different MAC protocols to access a common Access Point (AP). PPOMA adapts dynamically to coexist with these nodes: without prior knowledge of their protocols, it learns a channel access strategy that maximizes overall network throughput. We evaluate PPOMA through simulations in two scenarios, a perfect channel and an imperfect channel. Experimental results demonstrate that PPOMA continuously learns and refines its channel access strategy, achieving optimal performance in both scenarios, and that it outperforms alternative methods in both overall network throughput and convergence speed. In the perfect channel scenario, PPOMA's advantage lies primarily in convergence speed, converging roughly 500 iterations sooner on average than other algorithms. In the imperfect channel scenario, its advantage is mainly reflected in higher overall network throughput, with an increase of approximately 0.04.
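To make the core idea concrete, the sketch below shows a minimal PPO agent learning when to transmit on a shared channel. It is not the paper's implementation: the `ToyChannelEnv` (a single coexisting TDMA node with period 5, binary busy/idle feedback as the observation history), the network sizes, and all hyperparameters are illustrative assumptions. Only the PPO clipped-surrogate update itself follows the standard algorithm the abstract names.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

# Hypothetical toy environment: a TDMA node occupies every 5th slot;
# the DRL agent must learn to transmit (action 1) only in the free slots.
class ToyChannelEnv:
    def __init__(self, period=5, history=10):
        self.period, self.history, self.t = period, history, 0
        self.obs = [0.0] * history  # recent channel feedback (1.0 = busy/collision)

    def step(self, action):
        tdma_busy = (self.t % self.period == 0)
        if action == 1 and not tdma_busy:
            reward, feedback = 1.0, 0.0   # successful transmission
        elif action == 1 and tdma_busy:
            reward, feedback = 0.0, 1.0   # collision with the TDMA node
        else:
            reward, feedback = 0.0, 1.0 if tdma_busy else 0.0  # idle, just observe
        self.obs = self.obs[1:] + [feedback]
        self.t += 1
        return torch.tensor(self.obs), reward

# Small actor and critic networks over the feedback history (sizes are arbitrary).
policy = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 2))
value = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam([*policy.parameters(), *value.parameters()], lr=3e-4)

env, obs = ToyChannelEnv(), torch.zeros(10)
for update in range(200):
    # Collect a rollout of 128 slots under the current policy.
    obs_buf, act_buf, logp_buf, rew_buf = [], [], [], []
    for _ in range(128):
        dist = Categorical(logits=policy(obs))
        a = dist.sample()
        obs_buf.append(obs); act_buf.append(a)
        logp_buf.append(dist.log_prob(a).detach())
        obs, r = env.step(a.item())
        rew_buf.append(r)
    O, A, LP = torch.stack(obs_buf), torch.stack(act_buf), torch.stack(logp_buf)

    # Discounted returns, and advantages against the value baseline.
    R, ret = [], 0.0
    for r in reversed(rew_buf):
        ret = r + 0.99 * ret
        R.insert(0, ret)
    R = torch.tensor(R)
    adv = R - value(O).squeeze(-1).detach()
    adv = (adv - adv.mean()) / (adv.std() + 1e-8)

    # PPO clipped-surrogate update: the probability ratio is clipped so a
    # single rollout cannot push the policy too far from the one that produced it.
    for _ in range(4):
        dist = Categorical(logits=policy(O))
        ratio = torch.exp(dist.log_prob(A) - LP)
        pi_loss = -torch.min(ratio * adv, torch.clamp(ratio, 0.8, 1.2) * adv).mean()
        v_loss = ((value(O).squeeze(-1) - R) ** 2).mean()
        loss = pi_loss + 0.5 * v_loss - 0.01 * dist.entropy().mean()
        opt.zero_grad(); loss.backward(); opt.step()
```

Under these assumptions the agent converges to transmitting in the four slots per period the TDMA node leaves free, the same coexistence behavior PPOMA learns, at network scale, against heterogeneous protocols.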