“…The main difficulty in cross-media retrieval is defining a similarity measure among heterogeneous low-level features. To search and retrieve data from multiple modalities simultaneously, other approaches have been considered [10,11,12,13,14,15,16,17,18,19,20]. For instance, [16] shows experimentally that multimodal queries achieve higher retrieval accuracy than mono-modal ones.…”