Code comments convey information about programmers' intentions in a more explicit but less rigorous manner than source code. This information can assist programmers in various tasks, such as code comprehension, reuse, and maintenance. To better understand the properties of comments in source code, we analyzed more than 450,000 comments across 136 popular open-source software systems from different domains. We found that the percentages of methods with header comments and with internal comments were low, i.e., 4.4% and 10.27%, respectively. As an application of our findings, we propose an automatic approach, called commenting necessity identification, to determine whether a method needs a header comment. Specifically, we identify the important factors for determining the commenting necessity of a method and encode them as structural, syntactic, and textual features. Then, by applying machine learning and noise-handling techniques, we achieve a precision of 88.5% on eight open-source software systems from GitHub. These encouraging experimental results demonstrate the feasibility and effectiveness of our approach.
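The pipeline described above (extract per-method features, then train a classifier to predict commenting necessity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature set, the toy data, and the choice of a random forest are all assumptions made here for the sake of the example.

```python
# Minimal sketch of commenting-necessity identification as binary
# classification. Feature vectors are hypothetical stand-ins for the
# structural/syntactic/textual features described in the abstract:
# [lines_of_code, cyclomatic_complexity, num_parameters, identifier_words]
from sklearn.ensemble import RandomForestClassifier

X = [
    [5, 1, 0, 3],     # short getter-like method
    [120, 14, 4, 9],  # long, complex method
    [8, 2, 1, 4],
    [95, 10, 3, 7],
]
y = [0, 1, 0, 1]      # toy labels: 1 = method needs a header comment

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)
print(clf.predict([[110, 12, 5, 8]]))  # predicts for an unseen long, complex method
```

In practice, the noise-handling step mentioned in the abstract would clean mislabeled training examples (e.g., commented trivial methods) before fitting the classifier.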
Code comments are among the most important documents for helping developers review and comprehend source code. In recent studies, researchers have proposed many deep learning models to generate method header comments (i.e., method comments) and have achieved encouraging results. Comments inside a method body, called inline comments, are also important for program comprehension. Unfortunately, they have received much less attention in automatic generation than method comments. In this paper, we compare and analyze the similarities and differences between method comments and inline comments. By applying existing models for generating method comments to inline comment generation, we find that these models perform worse on the inline comment generation task. We then further explore the possible reasons and obtain a number of new observations. For example, we find that there are many templates (i.e., comments with the same or similar structure) in the method comment dataset, which makes the models appear to perform better. Moreover, some terms that a previous study considered important in comment generation (e.g., API calls) do not significantly affect the quality of the generated comments, which seems counter-intuitive. Our findings may offer implications for building future approaches to method comment or inline comment generation.
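The "template" observation above can be made concrete with a small sketch: normalize away the method-specific words in each comment and count how many comments collapse to the same skeleton. The frame vocabulary and toy corpus below are illustrative assumptions, not the paper's actual dataset or method.

```python
from collections import Counter

# Toy comment corpus (illustrative only).
comments = [
    "Returns the name of the user.",
    "Returns the id of the account.",
    "Returns the size of the buffer.",
    "Parse the configuration file and report errors.",
]

# A tiny frame vocabulary; words outside it are collapsed to <w>, so
# comments sharing the same structure map to the same template string.
FRAME = {"returns", "the", "of", "a", "an", "and"}

def skeleton(comment):
    tokens = comment.lower().rstrip(".").split()
    return " ".join(t if t in FRAME else "<w>" for t in tokens)

counts = Counter(skeleton(c) for c in comments)
print(counts.most_common(1))  # the "Returns the ... of the ..." skeleton covers 3 of 4 comments
```

A high count for a single skeleton signals a templated dataset, on which a generation model can score well by memorizing the frame rather than understanding the code.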