Abstract. Traditionally, direct marketing companies have relied on pre-testing to select the best offers to send to their audiences. Companies systematically dispatch the offers under consideration to a limited sample of potential buyers, rank them by performance and, based on this ranking, decide which offers to send to the wider population. Although this pre-testing process is simple and widely used, the direct marketing industry has recently come under increased pressure to further optimize learning, in particular when facing severe time and space constraints. Taking into account the multimedia nature of offers, which typically comprise both a visual and a text component, we propose a two-phase learning strategy based on a cascade of regression methods. The proposed approach takes advantage of visual and text features to improve and accelerate the learning process. Experiments in the domain of a commercial Multimedia Messaging Service (MMS) show that the proposed methods improve on classical learning techniques. The main contribution of the present work is to demonstrate that direct marketing firms can exploit information on visual content to optimize the learning phase. The proposed methods can be used in any multimedia direct marketing domain in which offers are composed of an image and text.