2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER) 2020
DOI: 10.1109/saner48275.2020.9054840
Are the Code Snippets What We Are Searching for? A Benchmark and an Empirical Study on Code Search with Natural-Language Queries

Cited by 50 publications (25 citation statements)
References 37 publications
“…Next, we trace them in the IJaDataset files, by following their references from the BigCloneBench dataset, and put them in our search corpus list (Tracing). Afterwards, we normalize each clone. We do not use comments, as they have been reported to be an unreliable and inconsistent source for extracting natural-language documents [54], [55]. Similarly, software projects can be poorly documented.…”
Section: B. Identifier Extraction
confidence: 99%
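The identifier-based extraction this excerpt describes, taking natural-language tokens from identifiers rather than comments, can be sketched as follows. This is an illustrative sketch only; the function name and regexes are assumptions, not the cited paper's actual code.

```python
import re

def extract_identifier_tokens(source: str) -> list[str]:
    """Split code identifiers into lowercase word tokens, approximating
    the normalization step described above (illustrative sketch only)."""
    # Collect candidate identifiers with a simple word matcher.
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    tokens = []
    for ident in identifiers:
        # Split snake_case first, then camelCase / PascalCase runs.
        for part in ident.split("_"):
            tokens.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part))
    return [t.lower() for t in tokens if t]

print(extract_identifier_tokens("readFileToString"))
# → ['read', 'file', 'to', 'string']
```

Tokens produced this way feed the search corpus directly, sidestepping the unreliable comments the excerpt warns about.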
“…Therefore, performance measures such as recall are not of major concern, as they only indicate whether an information retrieval system fails to report some relevant results. Previously, many researchers also did not report recall because of the similar nature of the problem (Keivanloo, Rilling & Zou, 2014; Lv et al., 2015; Gu, Zhang & Kim, 2018; Yan et al., 2020). Based on these reasons, we choose MRR, top-k accuracy and precision metrics to determine the performance of our information retrieval system.…”
Section: Empirical Evaluation
confidence: 99%
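The MRR and top-k accuracy metrics named in this excerpt are standard in code-search evaluation. A minimal sketch, assuming each query has at most one relevant result at a known 1-based rank (0 meaning the relevant result was not retrieved):

```python
def mean_reciprocal_rank(ranks: list[int]) -> float:
    """MRR over queries; `ranks` holds the 1-based rank of the first
    relevant result per query (0 = relevant result not retrieved)."""
    return sum(1.0 / r for r in ranks if r > 0) / len(ranks)

def top_k_accuracy(ranks: list[int], k: int) -> float:
    """Fraction of queries whose first relevant result is within the top k."""
    return sum(1 for r in ranks if 0 < r <= k) / len(ranks)

ranks = [1, 3, 0, 2]                 # first-relevant ranks for four queries
print(mean_reciprocal_rank(ranks))   # (1 + 1/3 + 0 + 1/2) / 4 ≈ 0.458
print(top_k_accuracy(ranks, 2))      # 2 of 4 queries hit within top-2 → 0.5
```

Both metrics reward placing the relevant snippet near the top of the result list, which matches how developers actually scan code-search results.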
“…Lucene is a popular search library for the development of various information retrieval solutions because of its scalability, high performance, and efficient search algorithms [42]. It has been shown to answer the highest number of queries compared to other code search approaches [43].…”
Section: Code Search Systems
confidence: 99%
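Lucene's core retrieval mechanism is an inverted index mapping terms to the documents containing them. A toy sketch of that idea (an illustration of the data structure only, not Lucene's actual API or implementation):

```python
from collections import defaultdict

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Toy inverted index: term -> set of doc ids (not Lucene's API)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """AND-semantics lookup: return docs containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

docs = {"d1": "read file to string", "d2": "sort list of strings"}
index = build_index(docs)
print(search(index, "file string"))  # → {'d1'}
```

Because each query term is a direct dictionary lookup followed by set intersections, query cost scales with posting-list sizes rather than corpus size, which is the scalability property the excerpt credits to Lucene.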