PathMiner: A Library for Mining of Path-Based Representations of Code

Kovalenko, Vladimir; Bogomolov, Egor; Bryksin, Timofey; Bacchelli, Alberto

doi:10.1109/msr.2019.00013

Cited by 34 publications

(27 citation statements)

References 13 publications


“…Several studies [2,3,23,35] account for structural information but differ from our work. Hu et al [23] proposed an approach to use Sequence-to-Sequence Neural Machine Translation to generate method-level code comments.…”

Section: Related Work (mentioning)

confidence: 73%

CC2Vec

Hoang

Kang

et al. 2020

Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

145


Existing work on software patches often uses features specific to a single task. These works often rely on manually identified features, and human effort is required to identify these features for each task. In this work, we propose CC2Vec, a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes. CC2Vec models the hierarchical structure of a code change with the help of the attention mechanism and uses multiple comparison functions to identify the differences between the removed and added code. To evaluate whether CC2Vec can produce a distributed representation of code changes that is general and useful for multiple tasks on software patches, we use the vectors produced by CC2Vec for three tasks: log message generation, bug fixing patch identification, and just-in-time defect prediction. In all tasks, the models using CC2Vec outperform the state-of-the-art techniques.


“…We design this workflow as follows: First, we collected the dataset including suspect file pairs and related reasons. Then, we employ state-of-the-art code embedding techniques such as PathMiner [86] to extract features for a file pair. Finally, we use classification techniques like CNN and RNN to implement automatic classification of reasons for suspect file pairs.…”

Section: A. The Rationale of Our Methods (mentioning)

confidence: 99%

Early Detection of Flawed Structural Dependencies During Software Evolution

Cui

2021

IEEE Access


During software evolution, complex structural dependencies between source files pose a great challenge to maintenance activities. Some of these dependencies propagate defects among files, incurring frequent bugs or changes and consuming significant maintenance costs. They can be referred to as flawed structural dependencies. In this paper, we propose a method to identify these potentially problematic dependencies at an early stage of software evolution by combining structural and semantic dependencies, so that developers can save maintenance costs by fixing these issues in time. Our method works as follows: First, we extract structural dependencies from the source code syntax and semantic dependencies from the source code lexicon. Second, we collect suspect file pairs by calculating the difference between structural and semantic dependencies. Next, we exhaustively examine each source file in the system and locate the interaction of its impacted subordinated files and suspect file pairs (SFP) as suspect dependencies. Finally, we gather all the suspect dependencies as candidate flawed structural dependencies. We evaluate our method using 838 releases of 15 open source projects, including 33353 bug reports and 86690 revision commits. The detection results show that our identified dependencies use 14% of all the files to capture almost 70% of the top 10% bug-prone or change-prone files with high precision: 92%. Moreover, our identified dependencies also incur 957% of the average bug frequency and 1050% of the average change frequency in future versions. In summary, our method can effectively and efficiently detect flawed structural dependencies in time during software evolution.


“…To make performance comparison with the code2vec for vulnerability prediction, we used two open source implementations called code2vec 4 and astminer 5 [38]. The former requires the latter to extract path context of the C codes considered in our own work, as the publicised implementation of the code2vec currently supports only Java and C# as the input languages.…”

Section: Comparison With the Code2vec (mentioning)

confidence: 99%

Vulnerability Prediction From Source Code Using Machine Learning

Bilgin

Ersoy

Soykan

et al. 2020

IEEE Access


As the role of information and communication technologies gradually increases in our lives, software security becomes a major issue to provide protection against malicious attempts and to avoid ending up with noncompensable damages to the system. With the advent of data-driven techniques, there is now a growing interest in how to leverage machine learning (ML) as a software assurance method to build trustworthy software systems. In this study, we examine how to predict software vulnerabilities from source code by employing ML prior to their release. To this end, we develop a source code representation method that enables us to perform intelligent analysis on the Abstract Syntax Tree (AST) form of source code and then investigate whether ML can distinguish vulnerable and nonvulnerable code fragments. To make a comprehensive performance evaluation, we use a public dataset that contains a large amount of function-level real source code parts mined from open-source projects and carefully labeled according to the type of vulnerability if they have any. We show the effectiveness of our proposed method for vulnerability prediction from source code by carrying out exhaustive and realistic experiments under different regimes in comparison with state-of-art methods.

