Measuring Similarity of Large Software Systems Based on Source Code Correspondence

Yamamoto, Takakazu; Matsushita, Masayuki; Kobayashi, Toshihiro; Inoue, Katsuro

doi:10.1007/11497455_41

Cited by 37 publications

(25 citation statements)

References 29 publications

“…Three approaches similar to ours are [19,14,22]. The first one detects function clones by comparing a set of metrics for each combination of functions and then categorizes the results on an ordinal scale.…”

Section: Related Work (mentioning)

confidence: 99%

Supporting the Grow-and-Prune Model in Software Product Lines Evolution Using Clone Detection

Mende

Beckwermert

Koschke

et al. 2008

2008 12th European Conference on Software Maintenance and Reengineering

Software Product Lines (SPL) can be used to create and maintain different variants of software-intensive systems by explicitly managing variability. Often, SPLs are organized as an SPL core, common to all products, upon which product-specific components are built. Following the so-called grow-and-prune model, SPLs may be evolved by copy&paste at large scale. New products are created from existing ones and existing products are enhanced with functionalities specific to other products by copying and pasting code between product-specific code. To regain control of this unmanaged growth, such code may be pruned, that is, identified and refactored into core components upon success. This paper describes tool support for the grow-and-prune model in the evolution of software product lines by identifying similar functions which can be moved to the core. These functions are identified in two steps. First, token-based clone detection is used to detect pairs of functions sharing code. Second, Levenshtein distance measures the textual similarity among these functions. Sufficient similarity at function level is then lifted to the architectural level. The approach is evaluated by three case studies, one using an open source email client to simulate the initial creation of an SPL, and two monitoring existing industrial product lines from the embedded domain.
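
The second step described in the abstract above, measuring textual similarity between two functions with Levenshtein distance, can be sketched roughly as follows. This is only an illustration, not the authors' tool: the function names `levenshtein` and `similarity` are hypothetical, and normalizing the distance by the length of the longer text is an assumption, since the abstract does not say how the distance is turned into a similarity score.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; normalization by the longer text is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty texts are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

A function pair would then count as a candidate for the core when `similarity` exceeds some threshold; the threshold itself is not given in the abstract.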

“…Yamamoto et al proposed SMAT tool that calculates similarity of software systems by counting similar lines of source code [17]. They identify corresponding source files between two software systems using CCFinder [10], and then compute differences between file pairs.…”

Section: Software Evolution (mentioning)

confidence: 99%

Approximating the Evolution History of Software from Source Code

Kanda

Ishio

Inoue

2015

IEICE Trans. Inf. & Syst.

Self Cite

“…The first approach performs cluster analysis for the sets of source code [6]. This is based on the similarity of two sets of source code, which is defined as the ratio of the numbers of similar code lines to that of the overall lines of two software systems.…”

Section: Automatic Categorization (mentioning)

confidence: 99%

Mega Software Engineering

Inoue

Garg

Iida

et al. 2005

Product Focused Software Process Improvement

Self Cite

Abstract. In various fields of computer science, rapidly growing hardware power, such as high-speed network, high-performance CPU, huge disk capacity, and large memory space, has been fruitfully harnessed. Examples of such usage are large scale data and web mining, grid computing, and multimedia environments. We propose that such rich hardware can also catapult software engineering to the next level. Huge amounts of software engineering data can be systematically collected and organized from tens of thousands of projects inside organizations, or from outside an organization through the Internet. The collected data can be analyzed extensively to extract and correlate multi-project knowledge for improving organization-wide productivity and quality. We call such an approach for software engineering Mega Software Engineering. In this paper, we propose the concept of Mega Software Engineering, and demonstrate some novel data analysis characteristic of Mega Software Engineering. We describe a framework for enabling Mega Software Engineering.
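
The similarity definition quoted in the citation statement for this entry (the ratio of similar code lines to the overall lines of two software systems) can be illustrated with a small sketch. It is not the cited tool: the name `line_similarity` is hypothetical, and Python's `difflib.SequenceMatcher` is used here only as a stand-in for the correspondence step, which the citing papers attribute to CCFinder and file differencing.

```python
from difflib import SequenceMatcher


def line_similarity(lines_a, lines_b):
    """Share of lines, over both inputs, that have a counterpart in the other.

    Assumption: exact line matches found by difflib stand in for the
    line correspondence computed by the cited approach.
    """
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    # Each matching block covers `size` lines in A and the same number in B.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    total = len(lines_a) + len(lines_b)
    if total == 0:
        return 1.0  # two empty inputs are treated as identical
    return 2 * matched / total


old = ["int x;", "x = 1;", "return x;"]
new = ["int x;", "x = 2;", "return x;"]
ratio = line_similarity(old, new)  # 2 of 3 lines correspond on each side
```

For whole systems, the same ratio would be accumulated over all corresponding file pairs rather than computed for a single pair.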
