With the rapid increase in the number of wireless sensor terminals in smart grids, backscattering has become a very promising green technology. By means of backscattering, wireless sensors can either reflect ambient energy signals to exchange information with each other or capture those signals to recharge their batteries. However, the changing environment around wireless sensors, the limited radio-frequency spectrum and the varied service priorities in uplink communications make resource allocation very challenging. In this paper, we propose a backscatter communication model based on business priority and a cognitive network. To achieve optimal system throughput, an asynchronous advantage actor-critic (A3C) algorithm is designed to tackle the uplink resource allocation problem. The experimental results indicate that the presented scheme can significantly enhance overall system performance and satisfy the business requirements of high-priority users.

Electronics 2020, 9, 622

compared to the active WPCN. The authors of [4] maximized the throughput of backscatter and radio frequency (RF) WSNs by formulating the optimization objectives.

The harvest-then-transmit (HTT) protocol is employed in a traditional RF-powered cognitive radio network (CRN), where the secondary transmitter (ST) harvests energy from the primary signals and then actively transmits information to the receiver using the energy stored in its battery. Recently, ambient energy signals have been used to transmit ST information by backscattering, which further reduces the overhead of transmitting data. Therefore, it is advisable to combine ambient backscatter communication with RF-powered CRN. The authors of [5] analyzed and compared the performance and throughput of RF-powered CRN in overlay and underlay scenarios.

The Internet of Things (IoT) is regarded as an ultimate solution for connecting everything and has aroused extensive interest in academia and industry.
However, energy limitations and implementation cost remain challenges in the construction of the IoT. In [6], ambient backscatter communication (AmBC) was introduced as a green communication paradigm that exploits ambient wireless signals to power sensors and backscatter information.

Many optimization problems have been studied for RF-powered backscatter networks. The authors of [7] studied a full-duplex communication mode in the AmBC network, in which a full-duplex device transmits the energy signal while receiving the signal reflected back from the backscatter devices (BDs). An iterative algorithm was proposed to optimize the minimum throughput among all BDs by jointly scheduling time and power resources and adjusting the reflection coefficients. The authors of [8] introduced backscatter communication into a multi-user device-to-device (D2D) network in which the devices could harvest energy from RF signals. A game-theoretic approach was proposed to balance relaying performance and energy harvesting.

The above algorithms require complete information about the environment. However, ...