Predicting Vulnerable Software Components via Text Mining

Scandariato, Riccardo; Walden, James; Hovsepyan, Aram; Joosen, Wouter

doi:10.1109/tse.2014.2340398

Cited by 281 publications

(195 citation statements)

References 34 publications

Supporting

Mentioning

191

Contrasting


“…The results show that the approach had good precision and recall when used for prediction within a single project. Walden et al [28] confirmed that the vulnerability prediction technique based on text mining (described in [21]) could be more accurate than models based on software metrics. They have collected a dataset of PHP vulnerabilities for three open source web applications by mining the NVD and security announcements of those applications.…”

Section: Related Work (mentioning)

confidence: 99%

“…Massacci and Nguyen [14] provide a comprehensive survey and independent empirical validation of several vulnerability discovery models. Several other metrics have been used: code complexity metrics [25,24,16], developer activity metrics [24], static analysis defect densities [27], frequencies of occurrence of programming constructs [21,28], etc. We illustrate some representative cases in Table 2.…”

Section: Related Work (mentioning)

confidence: 99%

“…Scandariato et al [21] proposed to use a machine learning approach that mines source code of Android components and tracks the occurrences of specific patterns. The authors used the Fortify SCA tool: if the tool issues a warning about a file, this file is considered to be vulnerable.…”

Section: Related Work (mentioning)

confidence: 99%
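
To illustrate the kind of feature extraction this statement describes, here is a minimal sketch, not the cited authors' implementation. It splits one source file into tokens and counts how often each occurs (a bag-of-words representation). The tokenizer regex and the function name `term_frequencies` are assumptions made for this example only.

```python
import re
from collections import Counter

# Hypothetical tokenizer: keeps identifiers and keywords only.
# A real text-mining pipeline may also keep operators, literals, or comments.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def term_frequencies(source_text: str) -> Counter:
    """Return a bag-of-words count of the tokens found in one source file."""
    return Counter(TOKEN_RE.findall(source_text))


if __name__ == "__main__":
    # Toy input, not taken from any studied application.
    sample = "if (user != null) { log(user); send(user); }"
    for token, count in term_frequencies(sample).items():
        print(token, count)
```

Each file's counts would then serve as one feature vector for a classifier, with the label (vulnerable or not) supplied separately, for example from static analysis warnings as the statement notes.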


On the Security Cost of Using a Free and Open Source Component in a Proprietary Product

Dashevskyi, Brucker, Massacci. 2016. Lecture Notes in Computer Science.


Abstract. The work presented in this paper is motivated by the need to estimate the security effort of consuming Free and Open Source Software (FOSS) components within a proprietary software supply chain of a large European software vendor. To this end, we have identified three different cost models: centralized (the company checks each component and propagates changes to the different product groups), distributed (each product group is in charge of evaluating and fixing its consumed FOSS components), and hybrid (only the least used components are checked individually by each development team). We investigated publicly available factors (e.g., development activity such as commits, code size, or fraction of code size in different programming languages) to identify which has the greatest impact on the security effort of using a FOSS component in a larger software product.


“…Although it is a relatively new area of research, a great number of VPMs has already been proposed in the related literature. As stated in [9], the main VPMs that can be found in the literature utilize software metrics [13][14][15][16][17][18][19][20][21][22], text mining [23][24][25][26][27][28], and security-related static analysis alerts [10,[29][30][31][32]] to predict vulnerabilities. These types of VPMs are analyzed in the rest of this section.…”

Section: Vulnerability Prediction Modeling (mentioning)

confidence: 99%

“…An empirical evaluation on 19 versions of a large-scale Android application, revealed that their technique may be promising for vulnerability prediction, as the produced predictors achieved sufficient precision (85% on average) and recall (87% on average). Based on these preliminary results, the same authors conducted a more elaborate empirical study to investigate the validity of their approach [25]. In particular, several VPMs using Naïve Bayes and Random Forest algorithms were constructed and evaluated on a code base of 20 large-scale Android applications.…”

Section: Vulnerability Prediction Modeling (mentioning)

confidence: 99%
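
The precision and recall figures quoted in the statement above follow the standard definitions: precision is the share of files flagged as vulnerable that really are vulnerable, and recall is the share of vulnerable files that were flagged. The sketch below shows the arithmetic only. The function name `precision_recall` and the toy labels are assumptions for illustration and are not data from any cited study.

```python
def precision_recall(actual, predicted):
    """Compute (precision, recall) for binary labels, 1 = vulnerable, 0 = clean."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    # Guard against division by zero when nothing is flagged or nothing is vulnerable.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up labels for four files.
    actual = [1, 1, 0, 0]
    predicted = [1, 0, 1, 0]
    print(precision_recall(actual, predicted))
```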

Static Analysis-Based Approaches for Secure Software Development

Siavvas, Gelenbe, Kehagias et al. 2018. Communications in Computer and Information Science.


Abstract. Software security is a matter of major concern for software development enterprises that wish to deliver highly secure software products to their customers. Static analysis is considered one of the most effective mechanisms for adding security to software products. The multitude of static analysis tools that are available provide a large number of raw results that may contain security-relevant information, which may be useful for the production of secure software. Several mechanisms that can facilitate the production of both secure and reliable software applications have been proposed over the years. In this paper, two such mechanisms, particularly the vulnerability prediction models (VPMs) and the optimum checkpoint recommendation (OCR) mechanisms, are theoretically examined, while their potential improvement by using static analysis is also investigated. In particular, we review the most significant contributions regarding these mechanisms, identify their most important open issues, and propose directions for future research, emphasizing the potential adoption of static analysis for addressing the identified open issues. Hence, this paper can act as a reference for researchers who wish to contribute to these subfields and to gain a solid understanding of the existing solutions and their open issues that require further research.


A performance evaluation of deep‐learnt features for software vulnerability detection

Ban, Liu, Chen et al. 2018. Concurrency and Computation.


Summary. Software vulnerability is a critical issue in the realm of cyber security. In terms of techniques, machine learning (ML) has been successfully used in many real‐world problems such as software vulnerability detection, malware detection and function recognition, for high‐quality feature representation learning. In this paper, we propose a performance evaluation study on ML‐based solutions for software vulnerability detection, conducting three experiments: machine learning‐based techniques for software vulnerability detection based on the scenario of a single type of vulnerability and multiple types of vulnerabilities per dataset; machine learning‐based techniques for cross‐project software vulnerability detection; and software vulnerability detection when facing the class imbalance problem with varying imbalance ratios. Experimental results show that it is possible to employ software vulnerability detection based on ML techniques. However, ML‐based techniques perform poorly on both the cross‐project and class imbalance problems in software vulnerability detection.
