Mobile app marketplaces are dominated by free apps that rely on advertising for their revenue. These apps place increased demands on the already limited battery lifetime of modern phones. For example, in the top 15 free Windows Phone apps, we found in-app advertising contributes to 65% of the app's total communication energy (or 23% of the app's total energy). Despite their small size, downloading ads each time an app is started and at regular refresh intervals forces the network radio to be continuously woken up, thus leading to a high energy overhead, so-called 'tail energy' problem. A straightforward mechanism to lower this overhead is to prefetch ads in bulk and serve them locally. However, the prefetching of ads is at odds with the real-time nature of modern advertising systems wherein ads are sold through real-time auctions each time the client can display an ad. This paper addresses the challenge of supporting ad prefetching with minimal changes to the existing advertising architecture. We build client models predicting how many ad slots are likely to be available in the future. Based on this (unreliable) estimate, ad servers make client ad slots available in the ad exchange auctions even before they can be displayed. In order to display the ads within a short deadline, ads are probabilistically replicated across clients, using an overbooking model designed to ensure that ads are shown before their deadline expires (SLA violation rate) and are shown no more than required (revenue loss). With traces of over 1,700 iPhone and Windows Phone users, we show that our approach can reduce the ad energy overhead by over 50% with a negligible revenue loss and SLA violation rate.