The full-duplex transmission protocol has been widely investigated in the literature as a means of improving radio spectrum usage efficiency. However, because self-interference suppression is imperfect, changes in transmission power and in the path loss of non-line-of-sight fading channels strongly affect the performance of the full-duplex transmission mode. Consequently, the full-duplex transmission protocol is not always a better choice than the traditional half-duplex transmission protocol. Considering solar energy-harvesting-powered cognitive radio networks (CRNs), we investigate a joint scheme for full-duplex/half-duplex transmission mode switching and transmission power allocation that exploits the advantages of both modes to maximize the long-term throughput of the network. First, we formulate the transmission rates of half-duplex and full-duplex links over fading channels between the cognitive user and the base station, where the channel gain is assumed to follow an exponential distribution. Then, taking into account the availability probability of the primary channel, the limited energy-harvesting capacity of the cognitive user, and the transmission capacity of the half-duplex and full-duplex links, we formulate the problem in terms of long-term expected throughput. The problem is solved within the partially observable Markov decision process framework to find the optimal transmission policy for the cognitive user and base station pair. In each time slot, the optimal policy specifies the transmission protocol (half-duplex or full-duplex) and the corresponding amount of transmission energy. In addition, to reduce the complexity of formulation and computation, we apply an actor–critic-based learning method to the considered problem. 
Finally, the performance of the proposed scheme is evaluated by comparing it with a conventional scheme that does not account for energy harvesting or long-term throughput.
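
The observation that full-duplex transmission is not always preferable can be illustrated with a minimal numerical sketch. It is not the rate formulation used in this work. It assumes symmetric transmission power `p` at both nodes, a reciprocal channel whose power gain `g` is exponentially distributed, noise power `n0`, and residual self-interference modeled as `chi * p`, where `chi` is a hypothetical suppression coefficient. All function and parameter names are illustrative.

```python
import math
import random


def hd_sum_rate(p, g, n0=1.0):
    """Two-way half-duplex sum rate in bit/s/Hz.

    Assumption: each direction uses half of the slot, so with symmetric
    power the two halves add up to a single log2(1 + SNR) term.
    """
    return math.log2(1.0 + p * g / n0)


def fd_sum_rate(p, g, n0=1.0, chi=0.01):
    """Two-way full-duplex sum rate in bit/s/Hz.

    Assumption: both directions transmit for the whole slot, and each
    receiver sees residual self-interference of power chi * p.
    """
    sinr = p * g / (n0 + chi * p)
    return 2.0 * math.log2(1.0 + sinr)


def expected_rates(p, mean_gain=1.0, n0=1.0, chi=0.01, samples=20000, seed=0):
    """Monte Carlo estimate of the expected HD and FD sum rates."""
    rng = random.Random(seed)
    hd_total = 0.0
    fd_total = 0.0
    for _ in range(samples):
        # Exponentially distributed channel power gain with the given mean.
        g = rng.expovariate(1.0 / mean_gain)
        hd_total += hd_sum_rate(p, g, n0)
        fd_total += fd_sum_rate(p, g, n0, chi)
    return hd_total / samples, fd_total / samples
```

Under these assumptions, `expected_rates(1.0, chi=0.0)` favors full-duplex, since perfect suppression doubles the half-duplex rate. `expected_rates(1000.0, chi=1.0)` favors half-duplex, because the full-duplex signal-to-interference-plus-noise ratio saturates near `g / chi` as the power grows. This crossover is what motivates switching between the two modes.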