2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR) 2017
DOI: 10.1109/msr.2017.28

SpreadCluster: Recovering Versioned Spreadsheets through Similarity-Based Clustering

Abstract: Version information plays an important role in spreadsheet understanding, maintenance, and quality improvement. However, end users rarely use version control tools to document spreadsheets' version information. Thus, the spreadsheets' version information is missing, and different versions of a spreadsheet coexist as individual and similar spreadsheets. Existing approaches try to recover spreadsheet version information by clustering these similar spreadsheets based on spreadsheet filenames or related email co…

Cited by 9 publications (3 citation statements)
References 32 publications
“…al. proposed SpreadCluster, a different approach for recovering spreadsheet version information [23]. Instead of clustering spreadsheets based on their filenames, they use features of the spreadsheet, like table headers and worksheet names.…”
Section: B Spreadsheet Evolution (mentioning)
confidence: 99%
“…That implies converting data blocks into a single, contiguous, and uninterrupted tabular structure with acceptable column headers. Automatizing this conversion process is dependent upon ongoing research [17], [18], [19], [20], and is not within the scope of this paper.…”
Section: A Automatically Inferring Spreadsheet Invariants (mentioning)
confidence: 99%
“…Machine learning and deep learning have become the hottest AI learning nowadays [2]. Clustering algorithm is widely used in machine learning [3][4][5]. Traditional clustering algorithms have to set the weight for different attributes and then fit it, in order to get more accurate conclusions and deeper characteristics of the data.…”
Section: Introduction (mentioning)
confidence: 99%