: With the rapid growth of multimedia data, its multi-source and multi-modal nature has become a challenging problem in multimedia research. The representation and generation of such data are two key factors in cross-modal learning research. Cross-modal representation studies feature learning and information integration methods using