To overcome the challenges of charging wireless sensors in complicated industrial environments, researchers are paying increasing attention to sensor networks that can harvest energy. This paper studies a wirelessly powered industrial sensor network in which each sensor harvests energy from a dedicated radio frequency (RF) energy source and uses it to transmit data to a receiver. Two working modes are discussed in this paper. The first is the frequency division multiplexing (FDM) working mode, in which the sensor transmits data over orthogonal frequency bands while simultaneously harvesting RF energy. The second is the time division multiplexing (TDM) working mode, which divides each time slot into two successive intervals: data is transmitted and energy is harvested in the same frequency band, but at distinct intervals. Because the channel conditions and the energy harvesting process are unpredictable, the sensors require an efficient resource allocation algorithm. We propose a novel resource allocation algorithm based on reinforcement learning. By using Proximal Policy Optimization (PPO), the proposed algorithm achieves continuous resource allocation and is applicable to continuous states. We also employ entropy regularization, online state normalization, reward scaling, and advantage normalization to improve the performance of the resource allocation algorithm in real-world scenarios. Numerical simulations show that, in both the FDM and TDM working modes, the proposed algorithm outperforms the greedy algorithm and the random algorithm in terms of long-term throughput.
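The PPO objective with two of the listed enhancements (entropy regularization and advantage normalization) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation; the function name, coefficient values, and input shapes are assumptions.

```python
import numpy as np

def ppo_loss(new_logp, old_logp, advantages, entropy,
             clip_eps=0.2, ent_coef=0.01):
    """Clipped PPO surrogate loss with an entropy bonus (illustrative sketch).

    new_logp, old_logp: per-sample log-probabilities of the taken actions
    advantages: per-sample advantage estimates
    entropy: per-sample policy entropy
    """
    # Advantage normalization: zero mean, unit variance across the batch.
    adv = (advantages - advantages.mean()) / (advantages.std() + 1e-8)
    # Probability ratio between the new and old policies.
    ratio = np.exp(new_logp - old_logp)
    # Clipped surrogate objective: take the pessimistic (minimum) term.
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    policy_loss = -np.mean(np.minimum(unclipped, clipped))
    # Entropy regularization encourages exploration of the continuous
    # resource allocation actions.
    return policy_loss - ent_coef * np.mean(entropy)
```

With identical old and new policies (ratio = 1) and a zero-mean advantage batch, the policy term vanishes and only the entropy bonus remains, which makes the role of each term easy to check in isolation.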