2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) 2019
DOI: 10.1109/msr.2019.00079

SeSaMe: A Data Set of Semantically Similar Java Methods

Citations: cited by 8 publications (3 citation statements)
References: 28 publications
“…There are several popular clone related datasets, which can be used for clone search. These datasets include BigCloneBench [36], [37], Project CodeNet (PCN) [38], SeSaMe [50], Pedagogical programming Open Judge (POJ-104) [51] and Google Code Jam (GCJ) [52]. BigCloneBench dataset contains references of clone methods belonging to different functionality types that exist in IJaDataset.…”
Section: B Clone Datasets (mentioning)
confidence: 99%
“…≤ 25 lines), have nontrivial or branching logic with no syntactic errors, and rely on built-in features or standard libraries. To find candidates, we explored code clone benchmarks [18], [19], [20], [21], [22], code search engines, open-source platforms like GitHub, and practice websites like Codewars and LeetCode. We manually modified some snippets to control our variables but wrote none from scratch.…”
Section: B Artifact Collection (mentioning)
confidence: 99%
“…Our study is carried out on two open-source datasets, SeSaMe [16] and GCJ [47]. SeSaMe is collected from 11 real-world software repositories, including JDK11 and eclipse.…”
Section: Data Collection (mentioning)
confidence: 99%