1+1>2: Integrating Deep Code Behaviors with Metadata Features for Malicious PyPI Package Detection

Sun, Xiaobing; Gao, Xingan; Cao, Sicong; Bo, Lili; Wu, Xiaoxue; Huang, Kaifeng

doi:10.1145/3691620.3695493

Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering 2024

Cited by 1 publication

References 24 publications

Killing Two Birds with One Stone: Malicious Package Detection in NPM and PyPI using a Single Model of Malicious Behavior Sequence

Zhang, Huang, Huang et al. 2024

ACM Trans. Softw. Eng. Methodol.

Open-source software (OSS) supply chain enlarges the attack surface of a software system, which makes package registries attractive targets for attacks. Recently, multiple package registries have received intensified attacks with malicious packages. Of those package registries, NPM and PyPI are two of the most severe victims. Existing malicious package detectors are developed with features from a list of packages of the same ecosystem and deployed within the same ecosystem exclusively, which is infeasible to utilize the knowledge of a new malicious NPM package detected recently to detect the new malicious package in PyPI. Moreover, existing detectors lack support to model malicious behavior of OSS packages in a sequential way. To address the two limitations, we propose a single detection model using malicious behavior sequence, named Cerebro, to detect malicious packages in NPM and PyPI. We curate a feature set based on a high-level abstraction of malicious behavior to enable multi-lingual knowledge fusing. We organize extracted features into a behavior sequence to model sequential malicious behavior. We fine-tune the pre-trained language model to understand the semantics of malicious behavior. Extensive evaluation has demonstrated the effectiveness of Cerebro over the state-of-the-art as well as the practically acceptable efficiency. Cerebro has detected 683 and 799 new malicious packages in PyPI and NPM, and received 707 thank letters from the official PyPI and NPM teams.
