Abstract: Various spectrum management schemes have been proposed in recent years to improve spectrum utilization in cognitive radio networks. However, few of them consider cognitive attackers who can adapt their attack strategy to the time-varying spectrum environment and to the secondary users' strategy. In this paper, we investigate security mechanisms for secondary users facing jamming attacks and propose a stochastic game framework for anti-jamming defense. At each stage of the game, secondary users observe the spectrum availability, the channel quality, and the attackers' strategy, inferred from the status of jammed channels. Based on this observation, they decide how many channels to reserve for transmitting control and data messages and how to switch between channels. Using minimax-Q learning, secondary users gradually learn the optimal policy, which maximizes the expected sum of discounted payoffs, defined as the spectrum-efficient throughput. The proposed stationary policy is shown to achieve much better performance than both a policy obtained from myopic learning, which maximizes only each stage's payoff, and a random defense strategy, because it accommodates the environment dynamics and the strategic behavior of the cognitive attackers.
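To make the learning step concrete, the following is a minimal sketch of one minimax-Q update, not the paper's implementation. It assumes, purely for illustration, that the defender has exactly two actions per state, so the maximin mixed strategy can be found exactly without a linear-programming solver; the abstract's actual action space (how many channels to reserve, how to switch) is larger. The names `maximin_two_actions` and `minimax_q_update`, the table layout `Q[s][a][o]`, and the default `alpha` and `gamma` values are all hypothetical.

```python
from itertools import combinations


def maximin_two_actions(q_rows):
    """Solve max_p min_o [p*Q(a0,o) + (1-p)*Q(a1,o)] for a two-action defender.

    q_rows[a][o] is the Q-value for defender action a (0 or 1) and attacker
    action o. Returns (value, p), where p is the probability of action 0.
    """
    q0, q1 = q_rows
    n = len(q0)

    def worst(p):
        # Payoff guaranteed against the attacker's best response to p.
        return min(p * q0[o] + (1 - p) * q1[o] for o in range(n))

    # worst(p) is concave and piecewise linear, so its maximum lies at an
    # endpoint or where two attacker actions yield the same payoff.
    candidates = [0.0, 1.0]
    for i, j in combinations(range(n), 2):
        denom = (q0[i] - q1[i]) - (q0[j] - q1[j])
        if denom != 0:
            p = (q1[j] - q1[i]) / denom
            if 0.0 <= p <= 1.0:
                candidates.append(p)
    best_p = max(candidates, key=worst)
    return worst(best_p), best_p


def minimax_q_update(Q, V, s, a, o, reward, s_next, alpha=0.1, gamma=0.9):
    """Apply one minimax-Q step after observing (s, a, o, reward, s_next).

    alpha (learning rate) and gamma (discount factor) are illustrative values.
    """
    Q[s][a][o] = (1 - alpha) * Q[s][a][o] + alpha * (reward + gamma * V[s_next])
    V[s], p = maximin_two_actions(Q[s])
    return p  # updated probability of playing action 0 in state s


# Sanity check on matching pennies: the defender should mix evenly.
value, p = maximin_two_actions([[1.0, -1.0], [-1.0, 1.0]])
```

A myopic learner would correspond to setting `gamma` to 0 in this sketch, so that only the current stage's payoff enters the update.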
Taking into account how the results of past behavior influence strategy choice in real life, we propose a payoff reflection mechanism. During the evolution of cooperation, each individual decides whether to learn a neighbor's strategy based on its own historical payoff and that of its neighbors; that is, strategy learning is affected by historical payoff. If the current strategy has yielded a greater payoff in the past, the player is less likely to change it. To this end, we introduce historical payoff reference coefficients w and u, which measure how much individuals refer to their own and their neighbors' historical payoffs, respectively. The memory interval length is denoted by M. Experimental results show that the payoff reflection mechanism can greatly improve the group's cooperation level. However, a larger reference rate for historical payoff is not always better: the strength of memory and the level of betrayal temptation both affect the optimal values of w and u.
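The abstract does not give the update formulas, so the following is only one plausible reading of the mechanism, written as a sketch under stated assumptions. It assumes that each player blends its current payoff with the mean of its earlier payoffs inside a memory window of length M, weighted by w for itself and u for the neighbor, and that imitation follows a Fermi rule with a noise parameter. The function names, the blending form, and the `noise` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import deque


def reflected_payoff(history, weight):
    """Blend the current payoff with the mean of earlier payoffs in memory.

    history holds at most M payoffs, most recent last; weight is in [0, 1].
    The linear blend is an assumption made for this sketch.
    """
    current = history[-1]
    past = list(history)[:-1]
    if not past:
        return current
    return (1 - weight) * current + weight * (sum(past) / len(past))


def imitation_probability(own_history, neighbor_history, w, u, noise=0.1):
    """Probability of adopting the neighbor's strategy (assumed Fermi rule)."""
    own = reflected_payoff(own_history, w)
    other = reflected_payoff(neighbor_history, u)
    return 1.0 / (1.0 + math.exp((own - other) / noise))


# Memory window of length M = 3 (illustrative payoffs, not experimental data).
M = 3
own = deque([3.0, 3.0, 1.0], maxlen=M)       # good history, poor current round
neighbor = deque([0.0, 0.0, 2.0], maxlen=M)  # poor history, good current round

p_no_memory = imitation_probability(own, neighbor, w=0.0, u=0.0)
p_with_memory = imitation_probability(own, neighbor, w=0.5, u=0.5)
```

In this toy case the player whose strategy paid well in the past is less likely to switch once history is taken into account, which matches the qualitative behavior the abstract describes.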