In this paper, we apply a Q‐learning algorithm to carry out slot assignment for machine‐type communication devices (MTCDs) in machine‐to‐machine communication. We first use a K‐means clustering algorithm to overcome the congestion problem in a machine‐to‐machine network in which each MTCD is associated with one controller. We then formulate slot selection as an optimisation problem and present a solution that uses the Q‐learning algorithm to find a conflict‐free slot assignment in a random access network with MTCD controllers. The performance of the solution depends on parameters such as the learning rate and the reward. We thoroughly analyse the performance of the proposed algorithm for different values of these parameters. The convergence time, that is, the time required to reach a solution, decreases as the learning rate increases, whereas the convergence probability increases. In addition, for smaller values of the learning rate, the convergence time decreases as the reward increases. We also compare the proposed scheme with simple ALOHA and channel‐based scheduled allocation and show that the Q‐learning‐based technique has a higher probability of assigning slots than either of these techniques. Copyright © 2016 John Wiley & Sons, Ltd.
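
As an illustration only, the following Python sketch shows one way that Q‐learning‐based slot selection of the kind summarised above could look. It is not the formulation used in the paper: the stateless update rule, the symmetric reward and penalty, the greedy choice with random tie‐breaking, and all names (`update_q`, `greedy_slot`, `assign_slots`) are assumptions made for the example. Each controller keeps one Q‐value per slot, picks its best slot in every frame, and is rewarded if no other controller picked the same slot.

```python
import random


def update_q(q, reward, alpha):
    """One stateless Q-learning step: move q toward the observed reward.

    Assumed update rule (not taken from the paper):
    Q <- (1 - alpha) * Q + alpha * reward
    """
    return (1.0 - alpha) * q + alpha * reward


def greedy_slot(q_row, rng):
    """Return the index of the highest Q-value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([s for s, q in enumerate(q_row) if q == best])


def assign_slots(n_controllers, n_slots, alpha=0.5, reward=1.0,
                 max_frames=10_000, seed=0):
    """Simulate frames until one frame has no slot conflicts.

    Returns (slots, frame) for the first conflict-free frame, or
    (None, max_frames) if no such frame occurs within max_frames.
    """
    rng = random.Random(seed)
    # Hypothetical Q-table: one row per controller, one column per slot.
    q = [[0.0] * n_slots for _ in range(n_controllers)]
    for frame in range(1, max_frames + 1):
        choice = [greedy_slot(row, rng) for row in q]
        counts = [0] * n_slots
        for s in choice:
            counts[s] += 1
        for c, s in enumerate(choice):
            # Assumed reward scheme: +reward if alone in the slot, -reward otherwise.
            r = reward if counts[s] == 1 else -reward
            q[c][s] = update_q(q[c][s], r, alpha)
        if all(counts[s] == 1 for s in choice):
            return choice, frame
    return None, max_frames


if __name__ == "__main__":
    slots, frames = assign_slots(n_controllers=4, n_slots=8)
    print("assignment:", slots, "frames used:", frames)
```

In this sketch, `alpha` plays the role of the learning rate and `reward` the role of the reward value, so both can be varied to observe how the number of frames needed to reach a conflict‐free assignment changes.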