Combined sewer overflows pose significant risks to human health because untreated water is discharged into the environment. Municipalities, such as the Metropolitan Sewer District of Greater Cincinnati (MSDGC), have recently begun collecting large amounts of water-related data and are considering deep learning (DL) solutions, such as recurrent neural networks (RNNs), for predicting overflow events. Assessing DL's fitness for purpose requires a systematic understanding of the problem context. In this study, we propose a requirements engineering framework that uses problem frames to identify and structure stakeholder concerns, analyses the physical situations in which high-quality data assumptions may not hold, and derives software testing criteria in the form of metamorphic relations that incorporate both input transformations and output comparisons. Applying our framework to MSDGC's overflow prediction problem provides a principled way to evaluate how well different RNN solutions meet the requirements.
KEYWORDS: deep learning, deep neural networks, software engineering
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
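As a rough illustration of what a metamorphic relation combining an input transformation with an output comparison might look like for overflow prediction, consider the minimal sketch below. Everything in it is an assumption made for illustration: `toy_predictor` is a hypothetical stand-in for a trained RNN, and the relation itself (scaling rainfall up should not lower the predicted overflow level) is only an example, not one taken from the study.

```python
from typing import Callable, List, Sequence

# A predictor maps a rainfall time series to a predicted overflow level.
Predictor = Callable[[Sequence[float]], float]


def toy_predictor(rainfall: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained RNN: a decaying running sum."""
    level = 0.0
    for r in rainfall:
        level = 0.8 * level + r
    return level


def scale_rainfall(rainfall: Sequence[float], factor: float) -> List[float]:
    """Input transformation: multiply every rainfall reading by `factor`."""
    return [r * factor for r in rainfall]


def check_monotonic_mr(
    predict: Predictor,
    rainfall: Sequence[float],
    factor: float = 1.2,
    tol: float = 1e-9,
) -> bool:
    """Output comparison: the follow-up prediction must not be lower
    than the source prediction (assumed relation, for illustration)."""
    source = predict(rainfall)
    follow_up = predict(scale_rainfall(rainfall, factor))
    return follow_up >= source - tol


series = [0.0, 2.5, 4.0, 1.0, 0.0]
mr_holds = check_monotonic_mr(toy_predictor, series)
print(mr_holds)
```

No labeled overflow data is needed here: the check compares two predictions from the same model rather than a prediction against ground truth.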
Testing deep learning systems requires expensive labeled data. In recent years, researchers have begun to leverage metamorphic testing to address this issue. However, metamorphic relations on image data remain poorly understood. To gain a deeper understanding of these metamorphic relations, we survey common image operations that model covariate shift, manually classify and categorize the underlying metamorphic relations, and conduct experiments to validate our classifications. In our experiments, we train three popular convolutional neural network architectures on an image classification task. Next, we apply metamorphic operations to the input test images and measure the change in classification accuracy and cross-entropy loss. A hierarchical clustering algorithm clusters these results and plots a dendrogram. We compare the groups from manual classification with the clusters from the algorithm to provide key insights. We find that Affine and Noise relations are consistent. Furthermore, we recommend metamorphic relations to save time and better test deep learning systems in the future.
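The measurement step described above (apply an operation to every test image, then compare accuracy before and after) can be sketched roughly as follows. This is a toy under stated assumptions: `toy_classifier` is a hypothetical brightness threshold standing in for a convolutional network, the three operations are illustrative choices rather than the surveyed set, and the clustering step is left out.

```python
import random
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]  # grayscale pixels in [0, 1]
Dataset = List[Tuple[Image, int]]


def toy_classifier(img: Image) -> int:
    """Hypothetical stand-in for a CNN: label 1 if the image is mostly bright."""
    pixels = [p for row in img for p in row]
    return int(sum(pixels) / len(pixels) > 0.5)


def flip_horizontal(img: Image) -> Image:
    """Affine-style operation: mirror each row."""
    return [row[::-1] for row in img]


def add_noise(img: Image, sigma: float = 0.01, seed: int = 0) -> Image:
    """Noise operation: add seeded Gaussian noise, clamped to [0, 1]."""
    rng = random.Random(seed)
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]


def invert(img: Image) -> Image:
    """Intensity inversion; not label-preserving for this toy classifier."""
    return [[1.0 - p for p in row] for row in img]


def accuracy(classify: Callable[[Image], int], data: Dataset) -> float:
    correct = sum(classify(img) == label for img, label in data)
    return correct / len(data)


def accuracy_change(classify: Callable[[Image], int], data: Dataset,
                    op: Callable[[Image], Image]) -> float:
    """Accuracy on transformed inputs minus accuracy on the originals."""
    transformed = [(op(img), label) for img, label in data]
    return accuracy(classify, transformed) - accuracy(classify, data)


test_set: Dataset = [
    ([[0.9, 0.8], [0.7, 0.9]], 1),
    ([[0.1, 0.2], [0.3, 0.1]], 0),
    ([[0.8, 0.6], [0.9, 0.7]], 1),
    ([[0.2, 0.4], [0.1, 0.3]], 0),
]
operations: Dict[str, Callable[[Image], Image]] = {
    "flip": flip_horizontal,
    "noise": add_noise,
    "invert": invert,
}
deltas = {name: accuracy_change(toy_classifier, test_set, op)
          for name, op in operations.items()}
print(deltas)
```

A vector of such deltas per operation (optionally extended with the change in cross-entropy loss) is the kind of feature a hierarchical clustering algorithm could then group.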