Code search is an important approach to improving the effectiveness and efficiency of software development. Current studies commonly search for target code based on either semantic or statistical information in large datasets. Semantic and statistical information have hidden relationships between them, since they describe code snippets from different perspectives. In this work, we propose a joint embedding model of semantic and statistical features to improve the effectiveness of code annotation. We then implement a code search engine, JessCS, based on the joint embedding model. We evaluate JessCS on more than 1 million lines of code snippets and corresponding descriptions. The experimental results show that JessCS is more effective than the UNIF-based approach, with improvements of at least 13% on the studied metrics.
Code search is a process that takes a given query as input and retrieves relevant code snippets from a code base. The relationship between query and code is commonly built on code annotation, which is extracted from code comments or other documents. Current code search studies treat code annotation approximately as ordinary natural language, disregarding its hidden structural information. To address this information loss, this work proposes a code annotation model that extracts features from five perspectives, and builds a code search engine, CodeHunter, on top of it. CodeHunter is evaluated on a dataset of 7 million code snippets and query descriptions. The experimental results show that CodeHunter obtains more effective results than Lucene and DeepCS. We also show that the effectiveness comes from the rich features and search models, and that CodeHunter works well with query descriptions of different sizes.
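To make the query-to-annotation retrieval step concrete, the following is a minimal sketch of annotation-based code search. It is not the model used by CodeHunter or JessCS: it swaps in plain bag-of-words cosine similarity for the learned features, and the names `tokenize`, `cosine`, `search`, and the toy `CODE_BASE` are hypothetical, introduced only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, code_base, top_k=3):
    """Rank (annotation, snippet) pairs by annotation similarity to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(annotation))), snippet)
        for annotation, snippet in code_base
    ]
    # Highest similarity first; snippets with no term overlap are dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for score, snippet in scored[:top_k] if score > 0]


# Hypothetical toy code base of (annotation, snippet) pairs.
CODE_BASE = [
    ("read all lines from a text file", "open(path).read().splitlines()"),
    ("sort a list of numbers in descending order", "sorted(xs, reverse=True)"),
    ("convert a string to lower case", "s.lower()"),
]

results = search("read lines from a file", CODE_BASE, top_k=1)
```

Because this baseline treats each annotation as an unordered bag of words, it ignores exactly the structural information that the abstract above says is lost when annotations are handled as ordinary natural language.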