Cognitive radio (CR) is an adaptive radio technology that automatically detects available channels in the wireless spectrum and changes its transmission parameters to improve radio operating behavior. Because spectrum availability and wireless channel conditions are dynamic, maintaining reliable network connectivity is difficult. Cluster-based CR ad-hoc networks (CRAHNs) arrange CR nodes into groups to maintain reliable autonomous networks. Clustering in a CRAHN supports cooperative tasks such as spectrum sensing and channel management, and improves network scalability and stability. In this paper, we propose a Q-learning-based cluster formation approach for CRAHNs, in which a Q-value is used to evaluate each node's channel quality. To form a distributed cluster network, channel quality, residual energy, and neighbor node/network conditions are considered. By exchanging status information about channels and neighbors, each node learns its neighboring topology and which node is the best candidate for cluster head (CH). Procedures for distributed CH selection, selection of the optimum common active data channel, and gateway node selection are presented. The proposed mechanism can extend the network lifetime and enhance reachability, both among member nodes and with other clusters. It can also provide stable and reliable service over the selected data channel and avoid possible interference between neighboring ad-hoc clusters.

INDEX TERMS Reinforcement learning, clustering, cognitive radio, Q-learning, ad-hoc network.
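
As a rough illustration of the idea summarized above, the following Python sketch shows how a per-channel Q-value might be maintained and then combined with residual energy and neighbor count to pick a cluster head and a common data channel. It is not the paper's algorithm: the stateless update rule, the reward values, the learning rate `ALPHA`, the score weights, and all names (`Node`, `ch_score`, `select_cluster_head`, `common_channel`) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

ALPHA = 0.5  # learning rate; an assumed value, not taken from the paper


def update_q(q, reward, alpha=ALPHA):
    """Stateless Q-learning update: Q <- (1 - alpha) * Q + alpha * reward."""
    return (1.0 - alpha) * q + alpha * reward


@dataclass
class Node:
    node_id: int
    residual_energy: float                         # assumed normalized to [0, 1]
    neighbors: set = field(default_factory=set)    # IDs of one-hop neighbors
    q_values: dict = field(default_factory=dict)   # channel -> Q-value

    def observe(self, channel, reward):
        """Fold one observation into the channel's Q-value.

        Assumed reward convention: 1.0 if the channel was usable, 0.0 if not.
        """
        old_q = self.q_values.get(channel, 0.0)
        self.q_values[channel] = update_q(old_q, reward)

    def ch_score(self, max_degree, weights=(0.4, 0.3, 0.3)):
        """Hypothetical weighted score over the three criteria in the abstract."""
        w_q, w_e, w_d = weights
        best_q = max(self.q_values.values(), default=0.0)
        degree = len(self.neighbors) / max_degree if max_degree else 0.0
        return w_q * best_q + w_e * self.residual_energy + w_d * degree


def select_cluster_head(nodes):
    """Pick the node with the highest score; ties go to the lowest node ID."""
    max_degree = max(len(n.neighbors) for n in nodes)
    return max(nodes, key=lambda n: (n.ch_score(max_degree), -n.node_id))


def common_channel(nodes):
    """Pick the channel known to every node with the largest summed Q-value."""
    shared = set.intersection(*(set(n.q_values) for n in nodes))
    if not shared:
        return None
    # Iterating in sorted order makes ties resolve to the lowest channel.
    return max(sorted(shared), key=lambda c: sum(n.q_values[c] for n in nodes))


a = Node(1, 0.9, {2, 3})
b = Node(2, 0.5, {1})
c = Node(3, 0.4, {1})
for node, channel, reward in [(a, 1, 1.0), (a, 1, 1.0), (a, 2, 0.0),
                              (b, 1, 1.0), (b, 2, 1.0),
                              (c, 1, 1.0), (c, 2, 0.0)]:
    node.observe(channel, reward)

head = select_cluster_head([a, b, c])
channel = common_channel([a, b, c])
print(head.node_id, channel)
```

In this toy run, node 1 has the best channel estimate, the most energy, and the most neighbors, so it would be chosen as CH, and channel 1 has the highest summed Q-value across the three nodes. In a real distributed setting each node would compute scores only from the status information exchanged with its neighbors rather than from a global list.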