Aiding comprehension of cloning through categorization

Kapser, Cory; Godfrey, Michael W.

doi:10.1109/iwpse.2004.1334772

Cited by 39 publications

(44 citation statements)

References 17 publications


“…They found that the percentage of cloned code did not change during software evolution and that, while new clones were added, some were factored out. Kapser and Godfrey [27] proposed a classification of clones based on their distance, i.e., within the same function, or file, or directory, and based on their granularity, i.e., block, function, or file. They used such a classification on clones detected in the Linux Kernel.…”

Section: Empirical Studies On the Presence And Evolution Of Clones (mentioning)

confidence: 99%

An empirical study on the maintenance of source code clones

Thummalapenta et al. 2009

Authors: Suresh Thummalapenta (North Carolina State University, Raleigh, USA); Luigi Cerulo, Lerina Aversano, Massimiliano Di Penta (Department of Engineering, University of Sannio, Benevento, Italy)

Code cloning has been very often indicated as a bad software development practice. However, many studies appearing in the literature indicate that this is not always the case. In fact, either changes occurring in cloned code are consistently propagated, or cloning is used as a sort of templating strategy, where cloned source code fragments evolve independently. This paper (i) proposes an automatic approach to classify the evolution of source code clone fragments, and (ii) reports a fine-grained analysis of clone evolution in four different Java and C software systems, aimed at investigating to what extent clones are consistently propagated or they evolve independently. Also, the paper investigates the relationship between the presence of clone evolution patterns and other characteristics such as clone radius, clone size and the kind of change the clones underwent, i.e., corrective maintenance or enhancement.


“…It is suspected that approximately 10%-15% of many large systems is part of duplicated code [2,12,24,25], and it has been documented to exist at rates of over 50% of the effective lines of code (ELOC) in a particular COBOL system [12]. The literature on the topic has described many situations that can lead to the duplication of code within a software system [2,7,21,22,27,28].…”

Section: Code Cloning (mentioning)

confidence: 99%

“…During our investigations of cloning in large software systems [23,24,25], we found several recurring patterns of cloning, or rather ways in which developers duplicated behavior. These patterns are defined by what is duplicated and why, and to some extent how the duplication is done.…”

Section: Patterns Of Cloning (mentioning)

confidence: 99%

“…Often referred to as code clones, these segments of code typically involve 10-15% of the source code [24,25]. Code clones can arise through a number of different activities.…”

Section: Introduction (mentioning)

confidence: 99%

“…Thus, a variety of concerns such as stability, code ownership, and design clarity need to be considered before any refactoring is attempted; a manager should try to understand the reason behind the duplication before deciding what action (if any) to take. 1 This paper introduces eight cloning patterns that we have uncovered during case studies on large software systems, some of which we reported in [23,24,25]. These patterns present both good and bad motivations for cloning, and we discuss both the advantages and disadvantages of these patterns of cloning in terms of development and maintenance.…”

Section: Introduction (mentioning)

confidence: 99%


"Cloning Considered Harmful" Considered Harmful

Kapser, Godfrey 2006

2006 13th Working Conference on Reverse Engineering


Supporting the analysis of clones in software systems

Kapser, Godfrey 2006

J. Softw. Maint. Evol.: Res. Pract.


Code duplication is a well-documented problem in industrial software systems. There has been considerable research into techniques for detecting duplication in software, and there are several effective tools to perform this task. However, there have been few detailed qualitative studies into how cloning actually manifests itself within software systems. This is primarily due to the large result sets that many clone-detection tools return; these result sets are very difficult to manage without complementary tool support that can scale to the size of the problem, and this kind of support does not currently exist. In this paper we present an in-depth case study of cloning in a large software system that is in wide use, the Apache Web server; we provide insights into cloning as it exists in this system, and we demonstrate techniques to manage and make effective use of the large result sets of clone-detection tools. In our case study, we found several interesting types of cloning occurrences, such as 'cloning hotspots', where a single subsystem comprising only 17% of the system code contained 38.8% of the clones. We also found several examples of cloning behavior that were beneficial to the development of the system, in particular cloning as a way to add experimental functionality.

(1) facilities to evaluate the overall cloning situation; (2) mechanisms to guide users toward clones that are most relevant to their task; and (3) methods for filtering and refining the analysis of the clones. Each of these criteria is described in more detail below.

Overall system evaluation

As a first step in understanding cloning within a software system, regardless of the end goal, maintainers must have an understanding of the cloning from a high level of abstraction. This understanding will allow the user to evaluate the extent and the severity of the duplication in order to estimate the cost and/or necessity of the task. Several mechanisms can be used to evaluate cloning from a high level. Visualization methods, such as scatterplots [1,3,4,12,15], are useful for the discovery of highly related subsystems and high levels of cloning within a subsystem. They are also useful for detecting unusual types of cloning, such as cloning from system libraries to other parts of the software system. Metric-oriented reports, such as reporting the percentage of lines cloned, average length of a clone, etc., are useful for directing users to points in the system where the most cloning is occurring, or where cloning activities are unusually high in relation to subsystem size.

Guide and empower the user

The possibly large sets of clones returned by the clone-detection methods make it infeasible to look at each individual clone. There are several ways to direct users toward the clones they seek. Metrics can be used to query the dataset [16]. Some examples of metrics that might be used are the size of the clone, the types of changes made to the clone, and types of external dependencies a code segment has. Such a method can direct users to promising refactoring ...
