This paper presents a novel Q-learning-based auction (QL-BA) algorithm for dynamic spectrum access in a one-primary-user, multiple-secondary-user (OPMS) scenario. In the auction market, each secondary user sets its bid price dynamically using a Q-learning-based bidding strategy to compete for the current access opportunity; meanwhile, the primary user decides to whom to release the unused spectrum according to the maximal-bid principle. Because spectrum opportunities are limited and time-varying, each bidder uses Q-learning to form a preference utility that accounts for both its current packet transmission and its future expectations. Simulation results show that the proposed QL-BA significantly improves secondary users' bidding strategies and, as a result, progressively improves performance in terms of packet loss, bidding efficiency, and transmission rate.
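
As an illustration of the auction loop summarized above, the following is a minimal sketch and not the QL-BA algorithm itself. It assumes a tabular Q-learner per secondary user whose state is its packet queue length and whose action is a discrete bid level, and it reads the maximal-bid principle as "the highest bid wins the slot". All names, the reward shape (`packet_value`, `loss_penalty`), the arrival model, and the parameter values are hypothetical choices made for this example.

```python
import random

# All constants below are assumptions made for illustration only.
BID_LEVELS = [0, 1, 2, 3, 4]           # hypothetical discrete bid prices
MAX_QUEUE = 5                          # hypothetical buffer size (states 0..MAX_QUEUE)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate


class Bidder:
    """Secondary user with a tabular Q-function over (queue length, bid level)."""

    def __init__(self, rng):
        self.rng = rng
        self.queue = 0
        self.q = [[0.0] * len(BID_LEVELS) for _ in range(MAX_QUEUE + 1)]

    def choose_bid(self):
        """Epsilon-greedy choice of a bid-level index for the current state."""
        if self.rng.random() < EPSILON:
            return self.rng.randrange(len(BID_LEVELS))
        row = self.q[self.queue]
        return max(range(len(BID_LEVELS)), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + GAMMA * max(self.q[next_state])
        self.q[state][action] += ALPHA * (target - self.q[state][action])


def select_winner(bids):
    """Primary user's rule: highest bid wins; ties go to the lowest index."""
    return max(range(len(bids)), key=bids.__getitem__)


def run_round(bidders, rng, arrival_prob=0.5, packet_value=5.0, loss_penalty=2.0):
    """One auction round: collect bids, allocate the slot, then learn."""
    states = [b.queue for b in bidders]
    actions = [b.choose_bid() for b in bidders]
    bids = [BID_LEVELS[a] for a in actions]
    winner = select_winner(bids)
    for i, b in enumerate(bidders):
        reward = 0.0
        if i == winner:
            reward -= bids[i]            # the winner pays its bid
            if b.queue > 0:
                b.queue -= 1             # and transmits one queued packet
                reward += packet_value
        if rng.random() < arrival_prob:  # a new packet may arrive
            if b.queue < MAX_QUEUE:
                b.queue += 1
            else:
                reward -= loss_penalty   # buffer overflow: the packet is lost
        b.update(states[i], actions[i], reward, b.queue)
    return winner, bids


def simulate(num_bidders=3, rounds=1000, seed=0):
    """Run repeated auction rounds and return the trained bidders."""
    rng = random.Random(seed)
    bidders = [Bidder(rng) for _ in range(num_bidders)]
    for _ in range(rounds):
        run_round(bidders, rng)
    return bidders
```

Calling `simulate()` returns the bidders with their learned Q-tables, from which a greedy bid per queue length can be read. The sketch captures only the interaction pattern (bid, allocate, update); the state, reward, and utility definitions used by QL-BA may differ.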