Kuhn, followed by Tsuchiya and Kikuno, developed a hierarchy of relationships among several common types of faults (such as variable and expression faults) in specification-based testing by studying the corresponding fault detection conditions. Their analytical results help explain the relative effectiveness of various fault-based testing techniques previously proposed in the literature. This article extends and complements their studies by analyzing the relationships between variable and literal faults, and among literal, operator, term, and expression faults. Our analysis is more comprehensive and produces a richer set of findings that interpret previous empirical results, can be applied to the design and evaluation of test methods, and inform how test cases should be prioritized for earlier detection of faults. Although this work originated from the detection of specification-related faults, our results are equally applicable to program-based predicate testing that involves logic expressions.
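To make the fault classes concrete, the sketch below contrasts an original logic expression with a mutated version for each fault type and enumerates the test cases that detect each mutation, i.e., the inputs on which the original and faulty predicates evaluate differently. The three-variable predicate and the specific mutations are illustrative assumptions of ours, not examples taken from the article.

```python
from itertools import product

# Original predicate: (a AND b) OR (NOT c)
def original(a, b, c):
    return (a and b) or (not c)

# Hypothetical faulty versions, one per fault class discussed above.
def variable_fault(a, b, c):   # variable 'b' replaced by variable 'c'
    return (a and c) or (not c)

def literal_fault(a, b, c):    # literal 'b' negated
    return (a and not b) or (not c)

def operator_fault(a, b, c):   # AND operator replaced by OR
    return (a or b) or (not c)

def term_fault(a, b, c):       # term '(NOT c)' omitted
    return a and b

def detecting_tests(faulty):
    """A test detects a fault iff the original and faulty
    predicates evaluate to different truth values on it."""
    return [t for t in product([False, True], repeat=3)
            if original(*t) != faulty(*t)]

for name, f in [("variable", variable_fault), ("literal", literal_fault),
                ("operator", operator_fault), ("term", term_fault)]:
    print(f"{name} fault detected by: {detecting_tests(f)}")
```

Running the sketch shows that different fault classes are detected by different (sometimes nested) sets of test cases, which is the kind of relationship the fault hierarchy formalizes.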
Automated program assessment systems have been widely adopted in many universities. Many of these systems judge the correctness of student programs by comparing their actual outputs with predefined expected outputs for selected test inputs. A common weakness of such systems is that student programs are marked as incorrect whenever their outputs deviate from the predefined ones, even if the deviations are minor and insignificant, and a human assessor would consider the programs to have satisfied the specifications. This critical weakness causes undue frustration to students and undesirable pedagogical consequences that undermine the benefits of these systems. To address this issue, we developed an improved mechanism for program output comparison that serves as a versatile test oracle, bringing the results of automated assessment much closer to those of human assessors. We evaluated the new mechanism in real programming classes using an existing automated program assessment system. We found that the new mechanism achieved zero false-positive error (it did not wrongly accept any incorrect output) and very low (0%–0.02%) false-negative error (wrongly rejecting correct outputs), with very high accuracy (99.8%–100%) in correctly recognizing outputs deemed acceptable by instructors. This represents a major improvement over an existing assessment mechanism, which had 56.4%–64.1% false-negative error and an accuracy of 25.4%–40.9%. Moreover, about 67%–96% of students achieved their best results on their first attempt, which could encourage them and reduce their frustration. Furthermore, students generally welcomed the new assessment mechanism and agreed it was beneficial to their learning.
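As an illustration of the kind of tolerant output comparison described above, the following sketch normalizes case, spacing, and blank lines, and compares numeric tokens with a small tolerance rather than textually. It is a minimal hypothetical simplification of such a test oracle, not the authors' actual mechanism; the function names and the tolerance parameter are our own assumptions.

```python
import re

NUM = re.compile(r"-?\d+(?:\.\d+)?")  # integer or decimal tokens

def normalize(text):
    """Canonicalize output so that minor, insignificant deviations
    (case, extra spaces, trailing whitespace, blank lines) do not
    cause a mismatch. Hypothetical simplification, not the paper's
    actual mechanism."""
    lines = [re.sub(r"\s+", " ", line).strip().lower()
             for line in text.splitlines()]
    return [line for line in lines if line]  # drop blank lines

def outputs_match(actual, expected, tol=1e-6):
    """Compare two program outputs line by line; numeric tokens are
    compared with a small mixed absolute/relative tolerance."""
    a_lines, e_lines = normalize(actual), normalize(expected)
    if len(a_lines) != len(e_lines):
        return False
    for a, e in zip(a_lines, e_lines):
        # Non-numeric parts of the line must agree exactly.
        if NUM.sub("#", a) != NUM.sub("#", e):
            return False
        a_nums, e_nums = NUM.findall(a), NUM.findall(e)
        if len(a_nums) != len(e_nums):
            return False
        for x, y in zip(a_nums, e_nums):
            if abs(float(x) - float(y)) > tol * max(1.0, abs(float(y))):
                return False
    return True

# Accepted despite case, spacing, and number-formatting differences:
print(outputs_match("Sum =  10.0\n", "sum = 10\n"))  # True
# Rejected because the numeric value genuinely differs:
print(outputs_match("sum = 11\n", "sum = 10\n"))     # False
```

A strict string comparison would reject the first pair of outputs outright, which is exactly the false-negative behavior the improved mechanism is designed to avoid.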