OBJECTIVE-The objective of this paper is to present the current evidence relative to the effectiveness of pair programming (PP) as a pedagogical tool in higher education CS/SE courses. METHOD-We performed a systematic literature review (SLR) of empirical studies that investigated factors affecting the effectiveness of PP for CS/SE students and studies that measured the effectiveness of PP for CS/SE students. RESULTS-Seventy four (74) papers were used in our synthesis of evidence, and 14 compatibility factors that can potentially affect PPʼs effectiveness as a pedagogical tool were identified. Results showed that studentsʼ skill level was the factor that affected PPʼs effectiveness the most. The most common measure used to gauge PPʼs effectiveness was time spent on programming. In addition, studentsʼ satisfaction when using PP was overall higher than when working solo. Our meta-analyses showed that PP was effective in improving studentsʼ grades on assignments. Finally, in the studies that used quality as a measure of effectiveness, the number of test cases succeeded, academic performance, and expert opinion were the quality measures mostly applied. CONCLUSIONS-The results of this SLR show two clear gaps in this research field: i) lack of studies focusing on pair compatibility factors aimed at making PP an effective pedagogical tool; ii) a lack of studies investigating PP for software design/modeling tasks in conjunction with programming tasks.
Technical debt is a metaphor to reflect the tradeoff software engineers make between short-term benefits and long-term stability. Self-admitted technical debt (SATD), a variant of technical debt, has been proposed to identify debt that is
intentionally introduced
during software development, e.g., temporary fixes and workarounds. Previous studies have leveraged human-summarized patterns (which represent n-gram phrases that can be used to identify SATD) or text-mining techniques to detect SATD in source code comments. However, several characteristics of SATD features in code comments, such as vocabulary diversity, project uniqueness, length, and semantic variations, pose a big challenge to the accuracy of pattern or traditional text-mining-based SATD detection, especially for cross-project deployment. Furthermore, although traditional text-mining-based method outperforms pattern-based method in prediction accuracy, the text features it uses are less intuitive than human-summarized patterns, which makes the prediction results hard to explain. To improve the accuracy of SATD prediction, especially for cross-project prediction, we propose a Convolutional Neural Network-- (CNN) based approach for classifying code comments as SATD or non-SATD. To improve the explainability of our model’s prediction results, we exploit the computational structure of CNNs to identify key phrases and patterns in code comments that are most relevant to SATD. We have conducted an extensive set of experiments with 62,566 code comments from 10 open-source projects and a user study with 150 comments of another three projects. Our evaluation confirms the effectiveness of different aspects of our approach and its superior performance, generalizability, adaptability, and explainability over current state-of-the-art traditional text-mining-based methods for SATD classification.
Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. complexity metrics) that correlate with potentially defective code. Those approaches however do not sufficiently capture the syntax and different levels of semantics of source code, an important capability for building accurate prediction models. In this paper, we develop a novel prediction model which is capable of automatically learning features for representing source code and using them for defect prediction. Our prediction system is built upon the powerful deep learning, tree-structured Long Short Term Memory network which directly matches with the Abstract Syntax Tree representation of source code. An evaluation on two datasets, one from open source projects contributed by Samsung and the other from the public PROMISE repository, demonstrates the effectiveness of our approach for both within-project and cross-project predictions.
CCS CONCEPTS• Software and its engineering → Software creation and management;
KEYWORDSSoftware engineering, software analytics, defect prediction ACM Reference Format:
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.