Abstract-Usage-based statistical testing employs knowledge about the actual or anticipated usage profile of the system under test for estimating system reliability. For many systems, usagebased statistical testing involves generating synthetic test data. Such data must possess the same statistical characteristics as the actual data that the system will process during operation. Synthetic test data must further satisfy any logical validity constraints that the actual data is subject to. Targeting data-intensive systems, we propose an approach for generating synthetic test data that is both statistically representative and logically valid. The approach works by first generating a data sample that meets the desired statistical characteristics, without taking into account the logical constraints. Subsequently, the approach tweaks the generated sample to fix any logical constraint violations. The tweaking process is iterative and continuously guided toward achieving the desired statistical characteristics. We report on a realistic evaluation of the approach, where we generate a synthetic population of citizens' records for testing a public administration IT system. Results suggest that our approach is scalable and capable of simultaneously fulfilling the statistical representativeness and logical validity requirements.
The General Data Protection Regulation (GDPR) harmonizes data privacy laws and regulations across Europe. Through the GDPR, individuals are able to better control their personal data in the face of new technological developments. While the GDPR is highly advantageous to individuals, complying with it poses major challenges for organizations that control or process personal data. Since no automated solution with broad industrial applicability currently exists for GDPR compliance checking, organizations have no choice but to perform costly manual audits to ensure compliance. In this paper, we share our experience building a UML representation of the GDPR as a first step towards the development of future automated methods for assessing compliance with the GDPR. Given that a concrete implementation of the GDPR is affected by the national laws of the EU member states, GDPR's expanding body of case law and other contextual information, we propose a two-tiered representation of the GDPR: a generic tier and a specialized tier. The generic tier captures the concepts and principles of the GDPR that apply to all contexts, whereas the specialized tier describes a specific tailoring of the generic tier to a given context, including the contextual variations that may impact the interpretation and application of the GDPR. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research.
Abstract. Many laws, e.g., those concerning taxes and social benefits, need to be operationalized and implemented into public administration procedures and eGovernment applications. Where such operationalization is warranted, the legal frameworks that interpret the underlying laws are typically prescriptive, providing procedural rules for ensuring legal compliance. We propose a UML-based approach for modeling procedural legal rules. With help from legal experts, we investigate actual legal texts, identifying both the information needs and sources of complexity in the formalization of procedural legal rules. Building on this study, we develop a UML profile that enables more precise modeling of such legal rules. To be able to use logic-based tools for compliance analysis, we automatically transform models of procedural legal rules into the Object Constraint Language (OCL). We report on an application of our approach to Luxembourg's Income Tax Law providing initial evidence for the feasibility and usefulness of our approach.
The ability to generate test data is often a necessary prerequisite for automated software testing. For the generated data to be fit for its intended purpose, the data usually has to satisfy various logical constraints. When testing is performed at a system level, these constraints tend to be complex and are typically captured in expressive formalisms based on first-order logic. Motivated by improving the feasibility and scalability of data generation for system testing, we present a novel approach, whereby we employ a combination of metaheuristic search and Satisfiability Modulo Theories (SMT) for constraint solving. Our approach delegates constraint solving tasks to metaheuristic search and SMT in such a way as to take advantage of the complementary strengths of the two techniques. We ground our work on test data models specified in UML, with OCL used as the constraint language. We present tool support and an evaluation of our approach over three industrial case studies. The results indicate that, for complex system test data generation problems, our approach presents substantial benefits over the state of the art in terms of applicability and scalability.
Simulation of legal policies is an important decision-support tool in domains such as taxation. The primary goal of legal policy simulation is predicting how changes in the law affect measures of interest, e.g., revenue. Legal policy simulation is currently implemented using a combination of spreadsheets and software code. Such a direct implementation poses a validation challenge. In particular, legal experts often lack the necessary software background to review complex spreadsheets and code. Consequently, these experts currently have no reliable means to check the correctness of simulations against the requirements envisaged by the law. A further challenge is that representative data for simulation may be unavailable, thus necessitating a data generator. A hard-coded generator is difficult to build and validate. We develop a framework for legal policy simulation that is aimed at addressing the challenges above. The framework uses models for specifying both legal policies and the probabilistic characteristics of the underlying population. We devise an automated algorithm for simulation data generation. We evaluate our framework through a case study on Luxembourg's Tax Law.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.