Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering 2018
DOI: 10.1145/3238147.3238169

Mining file histories: should we consider branches?

Abstract: Modern distributed version control systems, such as Git, offer support for branching: the possibility to develop parts of software outside the master trunk. Consideration of the repository structure in Mining Software Repository (MSR) studies requires a thorough approach to mining, but there is no well-documented, widespread methodology regarding the handling of merge commits and branches. Moreover, there is still a lack of knowledge of the extent to which considering branches during MSR studies impacts the res…
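To make the question in the abstract concrete, here is a minimal sketch of how a file's history can differ depending on whether branch commits are followed. It is not the paper's method: the commit graph `COMMITS`, the commit ids, and the helper names are hypothetical, and the toy model attributes a change only to the commit that authored it (a merge commit carries no changes of its own), which real tools may handle differently.

```python
from collections import deque

# Hypothetical toy history: commit id -> (parent ids, files changed).
# Assumption: the first parent of a merge is the trunk side.
COMMITS = {
    "c1": ([], {"a.py"}),
    "c2": (["c1"], {"b.py"}),
    "f1": (["c1"], {"a.py"}),     # commit made on a branch
    "f2": (["f1"], {"a.py"}),     # commit made on a branch
    "m1": (["c2", "f2"], set()),  # merge commit, no changes of its own
}


def first_parent_history(head):
    """Walk only first parents, i.e. a trunk-only view of the history."""
    history = []
    current = head
    while current is not None:
        history.append(current)
        parents = COMMITS[current][0]
        current = parents[0] if parents else None
    return history


def full_history(head):
    """Visit every commit reachable from head, following all parents."""
    seen = set()
    queue = deque([head])
    history = []
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        history.append(current)
        queue.extend(COMMITS[current][0])
    return history


def commits_touching(path, history):
    """Keep the commits in the given history that changed the given file."""
    return [c for c in history if path in COMMITS[c][1]]


trunk_only = commits_touching("a.py", first_parent_history("m1"))
with_branches = commits_touching("a.py", full_history("m1"))
print(len(trunk_only), len(with_branches))
```

In this toy graph the trunk-only walk sees one commit changing `a.py`, while the walk that follows all parents sees three, so any per-file metric derived from the history depends on which traversal is chosen.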

Cited by 30 publications
(30 citation statements)
references
References 51 publications
“…At the same time as an unintended side effect of decentralization, Git can also create implicit branches as illustrated in Fig. 2 [16,46]. Suppose that a developer pulled the latest snapshot from the remote repository, added several commits to the local repository (repo #1), and pushed them back.…”
Section: Git Metadata; citation type: mentioning
confidence: 99%
“…However, researchers observed that Git, the de facto VCS platform, is not mining-friendly [6], [23], [30]. Indeed, Git was not designed for an accurate retrieval of history changes [32]. For this purpose, many studies highlighted Git pitfalls and urged the MSR community to address them [6], [23], [31].…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…For this purpose, many studies highlighted Git pitfalls and urged the MSR community to address them [6], [23], [31]. In particular, a recent study of Kovalenko et al. [32] showed that handling branches and renamings is crucial for an accurate tracking of contributions in Git. Our proposed toolkit, SNIFFER, complies with these guidelines and proposes a renaming and branch aware analysis of the change history.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…At the technical level, we perform parsing to extract method content; we track simple refactorings of files and methods to prevent wrong changes according to [5]; we consider branches to correctly capture changes that are merged into the master [42]; and we determine characteristics of SPARQL and Cypher repositories by performing a linear regression analysis to generalize on the population that we sampled from.…”
Section: Commit Histories; citation type: mentioning
confidence: 99%
“…In [9,37,38], the promises and perils of mining GitHub's open-source data for empirical analysis is discussed. The effects of branching in the repository history on empirical studies is discussed in [42]. We treat branching by a dedicated method.…”
Section: Related Work; citation type: mentioning
confidence: 99%