Abstract:In the malware detection process, obfuscated malicious codes cannot be efficiently and accurately detected solely in the dynamic or static feature space. Aiming at this problem, an integrative feature extraction algorithm based on simhash was proposed, which combines the static information e.g., API (Application Programming Interface) calls and dynamic information (such as file, registry and network behaviors) of malicious samples to form integrative features. The experiment extracts the integrative features of some static information and dynamic information, and then compares the classification, time and obfuscated-detection performance of the static, dynamic and integrated features, respectively, by using several common machine learning algorithms. The results show that the integrative features have better time performance than the static features, and better classification performance than the dynamic features, and almost the same obfuscated-detection performance as the dynamic features. This algorithm can provide some support for feature extraction of malware detection.
Temporal knowledge graphs (KGs) have recently attracted increasing attention. The temporal KG forecasting task, which plays a crucial role in such applications as event prediction, predicts future links based on historical facts. However, current studies pay scant attention to the following two aspects. First, the interpretability of current models is manifested in providing reasoning paths, which is an essential property of path-based models. However, the comparison of reasoning paths in these models is operated in a black-box fashion. Moreover, contemporary models utilize separate networks to evaluate paths at different hops. Although the network for each hop has the same architecture, each network achieves different parameters for better performance. Different parameters cause identical semantics to have different scores, so models cannot measure identical semantics at different hops equally. Inspired by the observation that reasoning based on multi-hop paths is akin to answering questions step by step, this paper designs an Interpretable Multi-Hop Reasoning (IMR) framework based on consistent basic models for temporal KG forecasting. IMR transforms reasoning based on path searching into stepwise question answering. In addition, IMR develops three indicators according to the characteristics of temporal KGs and reasoning paths: the question matching degree, answer completion level, and path confidence. IMR can uniformly integrate paths of different hops according to the same criteria; IMR can provide the reasoning paths similarly to other interpretable models and further explain the basis for path comparison. We instantiate the framework based on common embedding models such as TransE, RotatE, and ComplEx. While being more explainable, these instantiated models achieve state-of-the-art performance against previous models on four baseline datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.