Abstract: With the prevalence of "We-Media", anyone can publish and receive information in various forms, anywhere and at any time, over the Internet. The rich cross-media information carried by multi-modal data across multiple media reaches a wide audience, closely reflects social realities, and has a much greater social impact than information from any single medium. Automatically detecting topics from cross-media data is therefore of great benefit to organizations (e.g., advertising agencies and governments) that care about public opinion. However, cross-media topic detection is challenging in two respects: 1) multi-modal data from different media often have distinct characteristics; 2) topics are presented in an arbitrary manner amid noisy web data. In this paper, we propose a multi-modality fusion framework and a topic recovery approach to effectively detect topics from cross-media data. The multi-modality fusion framework flexibly incorporates heterogeneous multi-modal data into a Multi-Modality Graph (MMG), which takes full advantage of the rich cross-media information to effectively detect topic candidates. The topic recovery (TR) approach improves the entirety and purity of detected topics by: 1) merging topic candidates that are highly relevant themes of the same real topic; 2) filtering out less-relevant noise data from the merged topic candidates. Extensive experiments on both single-media and cross-media data sets demonstrate the flexibility and effectiveness of our method in detecting topics from cross-media data.
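
The two topic recovery steps named above (merging related candidates, then filtering noise) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the fusion rule, the merge criterion, or the noise test. The code assumes a weighted average for fusing per-modality similarities, a Jaccard-overlap threshold for merging, and a mean-edge-weight threshold for filtering. All function names, parameters, and thresholds (`fuse_similarity`, `merge_candidates`, `filter_noise`, `min_overlap`, `min_avg`) are hypothetical.

```python
from itertools import combinations


def fuse_similarity(sims, weights):
    """Fuse per-modality similarities into one edge weight.

    Assumption: a weighted average over the modalities present in `sims`.
    """
    total = sum(weights[m] for m in sims)
    return sum(weights[m] * s for m, s in sims.items()) / total


def jaccard(a, b):
    """Jaccard overlap of two item sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_candidates(candidates, min_overlap=0.5):
    """Repeatedly merge candidate item sets whose overlap is high enough."""
    merged = [set(c) for c in candidates]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            if jaccard(merged[i], merged[j]) >= min_overlap:
                merged[i] |= merged[j]
                del merged[j]  # j > i, so index i stays valid
                changed = True
                break  # restart the scan on the updated list
    return merged


def filter_noise(topic, edge_weight, min_avg=0.3):
    """Keep items whose mean edge weight to the rest of the topic is high.

    `edge_weight` maps frozenset({u, v}) to a fused similarity; missing
    pairs are treated as weight 0.0.
    """
    kept = set()
    for item in topic:
        others = [o for o in topic if o != item]
        if not others:
            kept.add(item)
            continue
        avg = sum(edge_weight.get(frozenset((item, o)), 0.0)
                  for o in others) / len(others)
        if avg >= min_avg:
            kept.add(item)
    return kept
```

In this sketch, candidates are plain sets of item identifiers and the graph is a dictionary of pairwise weights; a real system would derive both from the MMG rather than supply them by hand.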