2018
DOI: 10.1007/s10664-018-9630-9

Redundancy-free analysis of multi-revision software artifacts

Abstract: Researchers often analyze several revisions of a software project to obtain historical data about its evolution. For example, they statically analyze the source code and monitor the evolution of certain metrics over multiple revisions. The time and resource requirements for running these analyses often make it necessary to limit the number of analyzed revisions, e.g., by only selecting major revisions or by using a coarse-grained sampling strategy, which could remove significant details of the evolution. Most …

Cited by 17 publications (21 citation statements)
References 80 publications
“…Magic methods are naturally detected by looking for function definition nodes with the appropriate name. To make these detections, we utilized LISA, a framework for performing large-scale software analysis on abstract syntax trees [2]. We analyzed the most recent revision of all 1,000 projects, totalling 178,735 files containing 38,505,577 lines of Python code.…”
Section: Measuring the Prevalence Of Idioms In (citation type: mentioning)
confidence: 99%
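The detection idea quoted above, finding function definition nodes whose names mark them as magic methods, can be sketched in a few lines. This is only an illustration using Python's standard `ast` module, not LISA, whose API is not shown in the quoted text; the helper name `find_magic_methods` and the sample class are assumptions made for this example.

```python
import ast


def find_magic_methods(source: str) -> list[str]:
    """Collect names of function definitions that look like magic methods.

    Hypothetical helper: a name counts as "magic" here if it starts and
    ends with a double underscore, e.g. __init__ or __add__.
    """
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        # Function definition nodes, both regular and async.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            name = node.name
            if len(name) > 4 and name.startswith("__") and name.endswith("__"):
                found.append(name)
    return found


# Illustrative input only; not taken from the analyzed projects.
SAMPLE = '''
class Vec:
    def __init__(self, x):
        self.x = x

    def __add__(self, other):
        return Vec(self.x + other.x)

    def norm(self):
        return abs(self.x)
'''

magic = find_magic_methods(SAMPLE)
```

Run over the sample, the helper reports `__init__` and `__add__` and skips the ordinary method `norm`. A study at the scale described in the quote would apply the same check per file across all projects.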
“…Depending on the intended application and field of use, provenance can be looked at various granularities [4,13]. On the finest granularity end of the spectrum, tracking the origin of programming building blocks like functions, methods or classes, code snippets, or even individual lines of code (SLOC) and abstract syntax trees (AST) [4], is useful when studying coding patterns across repositories [5,17]. On the opposite end, at the coarsest granularity, tracking the origin of whole repositories is useful when looking at the evolution of forks [7,28,42,53] or project popularity [8].…”
Section: Software Provenance Tracking (citation type: mentioning)
confidence: 99%
“…Underlying our method for visualizing graph evolution is an idea that originally relates primarily to the efficient computation of metrics and other analyses over the entire history of large software projects, although the technique applies to any kind of versioned graph, as we discuss in section V-D. To this end, we present a graph compression algorithm and numerous techniques for reducing redundancies when analyzing multiple revisions of the same project in previous work [5]. A central concept in that work is the idea of a "revision range".…”
Section: Background and Research Goal (citation type: mentioning)
confidence: 99%
“…Consequently, each revision to be measured is analyzed individually in a laborious, resource-intensive, and slow process. Research shows that more than 95% of data is redundantly analyzed when discounting the multi-revision nature of software and that, all other factors being equal, analyzing revisions individually can be over 50 times slower [5]. Short of expensively parallelizing the workload, a fairly simple improvement is to analyze revisions incrementally, recomputing values only in artifacts where changes occur.…”
Section: Background and Research Goal (citation type: mentioning)
confidence: 99%
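The incremental strategy mentioned in the statement above, recomputing values only in artifacts where changes occur, can be illustrated with a small sketch. It is not the algorithm of the cited work: it simply caches a per-file metric by content hash, and the revision data, the function name `analyze_incrementally`, and the line-count metric are all assumptions chosen for the example.

```python
import hashlib
from typing import Callable


def analyze_incrementally(
    revisions: list[dict[str, str]],
    metric: Callable[[str], int],
) -> tuple[list[dict[str, int]], int]:
    """Compute a per-file metric for each revision, skipping unchanged files.

    Each revision maps a file path to its content. A file whose content
    was already analyzed in an earlier revision is served from the cache.
    Returns the per-revision results and the number of metric computations.
    """
    cache: dict[str, int] = {}  # content hash -> metric value
    computed = 0
    results = []
    for files in revisions:
        per_file = {}
        for path, content in files.items():
            key = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if key not in cache:
                cache[key] = metric(content)
                computed += 1
            per_file[path] = cache[key]
        results.append(per_file)
    return results, computed


def count_lines(content: str) -> int:
    """Toy metric: number of lines in a file."""
    return len(content.splitlines())


# Hypothetical three-revision history; only one file changes per step.
REVISIONS = [
    {"a.py": "x = 1\n", "b.py": "y = 2\nz = 3\n"},
    {"a.py": "x = 1\n", "b.py": "y = 2\nz = 3\nw = 4\n"},
    {"a.py": "x = 1\n", "b.py": "y = 2\nz = 3\nw = 4\n", "c.py": "pass\n"},
]

results, computed = analyze_incrementally(REVISIONS, count_lines)
total_pairs = sum(len(files) for files in REVISIONS)
```

In this toy history there are 7 file-revision pairs, but the metric is computed only 4 times, because unchanged files are never re-analyzed. Hashing by content is one simple way to detect change; a real tool could instead rely on the version control system's own change information.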