Real time bidding is one of the most popular ways of selling impressions in online advertising, where online ad publishers allocate some blocks in their websites to sell in online auctions. In real time bidding, ad networks connect publishers and advertisers. There are many available ad networks for publishers to choose from. A possible approach for selecting ad networks and sending ad requests is called Waterfall Strategy, in which ad networks are selected sequentially. The ordering of the ad networks is very important for publishers, and finding the ordering that will provide maximum revenue is a hard problem due to the highly dynamic environment. In this paper, we propose a dynamic ad network ordering method to find the best ordering of ad networks for publishers that opt for Waterfall Strategy to select ad networks. This method consists of two steps. The first step is a prediction model that is trained on real time bidding historical data and provides an estimation of revenue for each impression. These estimations are used as initial values for the Q-table in the second step. The second step is based on Reinforcement Learning and improves the output of the prediction model. By calculating the revenue of our method and comparing that with the revenue of a fixed and predefined ordering method, we show that our proposed dynamic ad network ordering method increases publishers’ revenue.