2016 IEEE International Conference on Software Maintenance and Evolution (ICSME) 2016
DOI: 10.1109/icsme.2016.21

Using Temporal and Semantic Developer-Level Information to Predict Maintenance Activity Profiles

Abstract: Predictive models for software projects' characteristics have traditionally been based on project-level metrics, employing little developer-level information, or none at all. In this work we suggest novel metrics that capture temporal and semantic developer-level information collected on a per-developer basis. To address the scalability challenges involved in computing these metrics for each and every developer for a large number of source code repositories, we have built a designated repository …

Cited by 15 publications (31 citation statements)
References: 30 publications
“…Its Kappa on the other hand, would be 0, making this model much less appealing. (3) Our previous work [5] shows that source code change types as defined by Fluri et al [11] are statistically significant in the context of maintenance activity categories defined by Mockus et al [1]. We believe that boosting (i.e.…”
Section: Introduction
Citation type: mentioning
confidence: 90%
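
The quoted fragment does not say which model it refers to. One common situation in which Kappa is exactly 0 is a model that always predicts the most frequent class: its accuracy equals the agreement expected by chance, so Cohen's Kappa, (p_o - p_e) / (1 - p_e), vanishes. The sketch below illustrates that case only. The function name, the class labels, and the 6/2/2 class split are assumptions made for illustration and are not taken from the citing paper.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's Kappa between two equally long label sequences."""
    n = len(actual)
    # Observed agreement p_o: fraction of items where both labels match.
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Chance agreement p_e: sum over classes of the product of marginals.
    actual_freq = Counter(actual)
    predicted_freq = Counter(predicted)
    expected = sum(actual_freq[c] * predicted_freq[c] for c in actual_freq) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when only one class occurs on both sides;
        # returning 0.0 is a convention of this sketch.
        return 0.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels: 6 corrective, 2 adaptive, 2 perfective commits.
actual = ["corrective"] * 6 + ["adaptive"] * 2 + ["perfective"] * 2
# A model that always answers with the majority class.
majority = ["corrective"] * len(actual)

accuracy = sum(a == p for a, p in zip(actual, majority)) / len(actual)
kappa = cohen_kappa(actual, majority)
```

With these made-up labels the majority-class model reaches 60% accuracy, yet its Kappa is 0 because the same 60% agreement is expected by chance alone.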
“…First we classified the test dataset (the 15% of the entire labeled dataset) using a naive method to set an initial baseline. The naive method is based solely on searching for pre-defined words gathered from previous work [5], and returning the most frequent class (i.e., corrective) in case none of the keywords were present in a commit's message, see table 2 for more details. The results showed that 34.8% of the commits in the test dataset (60 commits) did not have any of the keywords present in their commit message, and were therefore automatically classified corrective.…”
Section: Utilizing Word Frequency Analysis
Citation type: mentioning
confidence: 99%
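
The naive baseline described in this statement can be sketched as follows: look for pre-defined keywords in the commit message and fall back to the most frequent class (corrective) when none is found. The keyword lists, the prefix matching, and the tie-breaking rule are assumptions made for illustration. The citing paper's actual keywords (its table 2) are not reproduced here.

```python
# Hypothetical keyword lists, one per maintenance activity category.
KEYWORDS = {
    "corrective": ("fix", "bug", "error"),
    "adaptive": ("add", "support", "new"),
    "perfective": ("refactor", "cleanup", "rename"),
}

# Fallback when no keyword matches: the most frequent class,
# which the quoted passage identifies as corrective.
DEFAULT_CLASS = "corrective"


def classify(message):
    """Return the category whose keywords match the most words in the message."""
    words = message.lower().split()
    # Count words that start with any keyword of each category
    # (prefix matching is an assumption of this sketch).
    hits = {
        label: sum(word.startswith(k) for word in words for k in keywords)
        for label, keywords in KEYWORDS.items()
    }
    # Ties go to the category listed first in KEYWORDS.
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else DEFAULT_CLASS


def has_keyword(message):
    """True if at least one pre-defined keyword appears in the message."""
    words = message.lower().split()
    return any(
        word.startswith(k)
        for word in words
        for keywords in KEYWORDS.values()
        for k in keywords
    )
```

For example, `classify("Refactor parser module")` yields `"perfective"`, while a message such as `"Update README"` contains none of the assumed keywords and falls back to `"corrective"`. This fallback is why every keyword-free commit in the quoted experiment was labeled corrective.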
“…In the course of our studies [6][7][8], the data processing stage typically included the following aggregations: commit level; developer level; project level; global statistics. The analytical layer we present allows researchers to produce commit level aggregations (see Listing 5) and obtain statistics such as: change type frequencies, number of test case (test method) addition/removal/modification, number of test suite (test class) addition/removal/modification, associated ticket id, number of test files, and non test files in a given commit.…”
Section: Obtaining Fine Grained Source Code Changes
Citation type: mentioning
confidence: 99%
“…To effectively process large datasets, our analytical layer leverages Apache Spark [12] (henceforth Spark), a widely popular distributed computation engine. The analytical layer we suggest has been successfully used to conduct a number of studies in the field of software maintenance and evolution [6][7][8]. This leads us to believe it can be useful for researchers conducting studies that involve fine-grained source code changes.…”
Section: Introduction
Citation type: mentioning
confidence: 99%