Abstract: In this paper we consider a set of Software as a Service (SaaS) providers that offer a set of Web services using the Cloud facilities provided by an Infrastructure as a Service (IaaS) provider. We assume that the IaaS provider offers a pay-only-for-what-you-use scheme similar to the Amazon EC2 service, comprising flat, on-demand, and spot virtual machine instances. We propose a two-stage provisioning scheme. In the first stage, the SaaS providers determine the number of required flat and on-demand instances by means of standard optimization techniques. In the second stage, the SaaS providers compete by bidding for the spot instances, which are instantiated using the unused IaaS capacity. We focus on the bidding decision process of the SaaS providers, which takes place during the second stage, and model it as an N-armed bandit problem, in which a player repeatedly faces a choice among N different options and makes each decision by evaluating the feedback received from past choices. Through numerical experiments, we analyze the proposed strategies under different scenarios and show that the SaaS providers are able to refine their behavior round by round and to determine the best bid so as to maximize their revenue and obtain as many spot resources as possible. We also address the importance of the trade-off between exploration and exploitation, i.e., between non-greedy and greedy actions.
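
To make the bandit view of the second stage concrete, the following is a minimal sketch of one standard strategy, epsilon-greedy, applied to a choice among N candidate bids. It is an illustration only, not necessarily one of the strategies evaluated in this paper. The function names, the candidate bid values, the uniform spot-price model, and the reward definition (per-instance value minus the bid when the bid wins, zero otherwise) are all assumptions introduced here.

```python
import random


def epsilon_greedy_bidding(bids, reward_fn, rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy choice among N candidate bids (the N arms).

    With probability epsilon a random bid is tried (exploration);
    otherwise the bid with the highest estimated reward is used
    (exploitation). Estimates are running sample averages.
    """
    rng = random.Random(seed)
    counts = [0] * len(bids)    # how many times each bid was played
    values = [0.0] * len(bids)  # estimated mean reward of each bid
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(len(bids))                        # non-greedy action
        else:
            arm = max(range(len(bids)), key=lambda i: values[i])  # greedy action
        reward = reward_fn(bids[arm], rng)
        counts[arm] += 1
        # Incremental update of the sample average for the chosen bid.
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts


def toy_reward(bid, rng, value_per_instance=1.0):
    """Hypothetical feedback model (an assumption, not taken from the paper).

    The spot price is drawn uniformly from [0.2, 0.6]. A bid at or above
    the spot price wins the instance and earns value_per_instance - bid;
    a losing bid earns nothing.
    """
    spot_price = rng.uniform(0.2, 0.6)
    return value_per_instance - bid if bid >= spot_price else 0.0


bids = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical candidate bids
values, counts = epsilon_greedy_bidding(bids, toy_reward)
best = max(range(len(bids)), key=lambda i: values[i])
print("estimated reward per bid:", [round(v, 3) for v in values])
print("bid with highest estimate:", bids[best])
```

Setting `epsilon` to 0 gives a purely greedy player, which in this toy setting never moves away from the first bid it tries; larger values spend more rounds exploring bids that may turn out to be worse. This is the exploration and exploitation trade-off mentioned above.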