Nowadays, it is very convenient to capture photos with a smartphone. Because the smartphone is a convenient way to share what users experience anytime and anywhere through social networks, users often capture multiple photos of the same scene to make sure the content is well photographed. In this paper, an effective scalable mobile image retrieval approach is proposed that exploits contextual salient information for the input query image. Our goal is to explore the high-level semantic information of an image by finding the contextual saliency from multiple relevant photos rather than solely using the input image. Thus, the proposed approach first determines the relevant photos according to visual similarity, then mines salient features by exploring contextual saliency across the multiple relevant images, and finally determines the contributions of the salient features for scalable retrieval. Compared with existing mobile image retrieval approaches, our approach requires less bandwidth and achieves better retrieval performance. Retrieval can be carried out with <200 B of data, which is <5% of the data required by existing approaches. Most importantly, when the bandwidth is limited, we can rank the features to be transmitted according to their contributions to retrieval. Experimental results show the effectiveness of the proposed approach.
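The bandwidth-limited step described above can be illustrated with a minimal sketch. It is not the paper's actual method: the `Feature` record, its `contribution` score, the per-feature `size_bytes`, and the greedy cutoff are all assumptions introduced only to show how features ranked by contribution might be kept within a byte budget.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    # All fields are hypothetical and exist only for this illustration.
    word_id: int         # identifier of a salient feature (e.g. a visual word)
    contribution: float  # assumed score of how much the feature helps retrieval
    size_bytes: int      # assumed number of bytes needed to transmit it


def select_for_budget(features: List[Feature], budget_bytes: int) -> List[Feature]:
    """Keep the highest-contribution features that fit in the byte budget."""
    ranked = sorted(features, key=lambda f: f.contribution, reverse=True)
    selected: List[Feature] = []
    used = 0
    for feature in ranked:
        if used + feature.size_bytes > budget_bytes:
            break  # preserve strict rank order: stop at the first feature that does not fit
        selected.append(feature)
        used += feature.size_bytes
    return selected


# Toy data with made-up scores and sizes.
features = [
    Feature(word_id=17, contribution=0.9, size_bytes=4),
    Feature(word_id=42, contribution=0.2, size_bytes=4),
    Feature(word_id=5, contribution=0.6, size_bytes=4),
    Feature(word_id=8, contribution=0.4, size_bytes=4),
]
chosen = select_for_budget(features, budget_bytes=10)
print([f.word_id for f in chosen])
```

Because the list is sorted before truncation, a smaller budget always yields a prefix of what a larger budget would send, which is one simple way to obtain the scalable behavior the abstract refers to.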