An important aspect of developing models that relate the number and type of faults in a software system to a set of structural measurements is defining what constitutes a fault. By definition, a fault is a structural imperfection in a software system that may lead to the system's eventually failing. A measurable and precise definition of what faults are makes it possible to identify and count them accurately, which in turn allows the formulation of models relating fault counts and types to other measurable attributes of a software system. Unfortunately, the most widely used definitions are not measurable: there is no guarantee that two different individuals looking at the same set of failure reports and the same set of fault definitions will count the same number of underlying faults.
The incomplete and ambiguous nature of current fault definitions adds a noise component to the inputs used in modeling fault content. If this noise component is sufficiently large, any attempt to develop a fault model will produce invalid results. As part of our ongoing work in modeling software faults, we have developed a method of unambiguously identifying and counting faults. Specifically, we base our recognition and enumeration of software faults on the grammar of the language of the software system. By tokenizing the differences between a version of the system exhibiting a particular failure behavior and the version in which changes were made to eliminate that behavior, we are able to unambiguously count the number of faults associated with that failure. With modern configuration management tools, the identification and counting of software faults can be automated.
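As a rough illustration of this token-based view of a fix (not the authors' actual tooling), the sketch below diffs a failing source fragment against its repaired counterpart and counts the tokens touched by the change. The regex tokenizer and the count_changed_tokens helper are simplified stand-ins for a tokenizer derived from the programming language's grammar.

```python
import difflib
import re

# Simplified stand-in for a grammar-driven tokenizer: identifiers,
# numeric literals, and single-character operators/punctuation.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")

def tokenize(line):
    return TOKEN_RE.findall(line)

def count_changed_tokens(failing_version, fixed_version):
    """Count tokens on lines that differ between the failing and repaired source."""
    changed = 0
    diff = difflib.unified_diff(
        failing_version.splitlines(), fixed_version.splitlines(), lineterm=""
    )
    for line in diff:
        # Skip diff headers; count tokens only on added or removed lines.
        if line.startswith(("---", "+++", "@@")):
            continue
        if line.startswith(("+", "-")):
            changed += len(tokenize(line[1:]))
    return changed

failing = "total = total + 1\nreturn total\n"
fixed = "total = total + step\nreturn total\n"
print(count_changed_tokens(failing, fixed))  # tokens touched by the repair
```

Because the comparison is mechanical, the same count is produced no matter who runs it, which is the property the abstract argues is missing from failure-report-based counting.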
Problem reports at NASA are similar to bug reports: they capture defects found during testing and post-launch operational anomalies, and they document the investigation and corrective action for each issue. These artifacts are a rich source of lessons learned for NASA, but they are expensive to analyze because problem reports consist primarily of natural language text. We apply topic modeling to a corpus of NASA problem reports to extract trends in testing and operational failures. We collected 16,669 problem reports from six NASA space flight missions and applied Latent Dirichlet Allocation (LDA) topic modeling to the document corpus. We analyze the most popular topics within and across missions, and how popular topics changed over the lifetime of a mission. We find that hardware material and flight software issues are common during the integration and testing phase, while ground station software and equipment issues are more common during the operations phase. We identify a number of challenges in topic modeling for trend analysis: 1) the process of selecting topic modeling parameters lacks definitive guidance, 2) defining semantically meaningful topic labels requires nontrivial effort and domain expertise, 3) topic models derived from the combined corpus of the six missions were biased toward the larger missions, and 4) topics must be semantically distinct as well as cohesive to be useful. Nonetheless, topic modeling can identify problem themes within missions and across mission lifetimes, providing useful feedback to engineers and project managers.
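A minimal sketch of the kind of LDA workflow described above, using scikit-learn on placeholder report text; the example corpus, the number of topics, and the bag-of-words preprocessing are illustrative assumptions rather than the study's actual configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Placeholder problem-report snippets; the real corpus would hold thousands
# of natural-language reports collected from the six missions.
reports = [
    "ground station antenna lost lock during downlink pass",
    "flight software rebooted after watchdog timer expired",
    "connector material showed corrosion during integration test",
    "telemetry dropout traced to ground equipment power fault",
]

# Bag-of-words representation with common English stop words removed.
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(reports)

# Fit an LDA model; the number of topics is an assumed tuning parameter,
# which is exactly the parameter-selection difficulty noted in challenge 1.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Print the top words for each discovered topic; assigning a
# semantically meaningful label to each list is the manual step
# described in challenge 2.
terms = vectorizer.get_feature_names_out()
for topic_idx, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"topic {topic_idx}: {', '.join(top)}")
```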