Cognitive radio enables secondary users (SUs) to explore and exploit underutilized licensed channels (white spaces) owned by primary users. To improve network scalability, SUs are organized into clusters. This article proposes a novel artificial-intelligence-based trust model that uses reinforcement learning (RL) to improve traditional budget-based cluster size adjustment schemes. The RL-based trust model enables the clusterhead to observe and learn the behaviors of its SU member nodes and to revoke the membership of malicious SUs, mitigating the effects of intelligent and collaborative attacks while adjusting the cluster size dynamically according to the availability of white spaces. The malicious SUs launch attacks on clusterheads that cause the cluster to become inappropriately sized, while learning to remain undetected. In every attack and defense scenario, both the attackers and the clusterhead adopt RL approaches. Simulation results show that single-agent RL (SARL) attackers reduce the cluster size significantly, whereas the SARL clusterhead only slightly increases its cluster size; this motivates a rule-based approach for an efficient counterattack. Multi-agent RL attacks are shown to be less effective in a dynamic operating environment.

Index Terms: Cognitive radio ad hoc networks, clustering methods, trust management, artificial intelligence, security.