Identifying possible threat actors from samples of malware remains an active area of research with important ramifications for cybersecurity practitioners. The unsupervised identification and characterization of malware samples has primarily been treated as an early integration, multi-modal clustering problem, in which all features derived from the samples are concatenated into a single feature vector that is then fed into a standard unsupervised learning algorithm. In this work, we focus on grouping malware samples by possible threat actor from both late integration and intermediate integration multi-modal data perspectives. In doing so, we propose a new algorithm, Cross-Modal Influence Clustering, for characterizing malware samples into threat actor groups. We test our proposed method, along with several other multi-modal clustering techniques, on heterogeneous malware samples from three different threat actors. Our results indicate that our proposed method performs best among the methods tested at clustering malware samples into threat actors in an unsupervised setting, and that intermediate and late integration multi-modal paradigms consistently yield better results than an early integration one.

INDEX TERMS Cybersecurity, malware, multi-modal data, network diffusion, unsupervised machine learning, multi-view data.