Finding good configurations of a software system is often challenging since the number of configuration options can be large. Software engineers often make poor choices about configuration or, even worse, they usually use a sub-optimal configuration in production, which leads to inadequate performance. To assist engineers in finding the better configuration, this article introduces FLASH, a sequential model-based method that sequentially explores the configuration space by reflecting on the configurations evaluated so far to determine the next best configuration to explore. FLASH scales up to software systems that defeat the prior state-of-the-art model-based methods in this area. FLASH runs much faster than existing methods and can solve both single-objective and multi-objective optimization problems. The central insight of this article is to use the prior knowledge of the configuration space (gained from prior runs) to choose the next promising configuration. This strategy reduces the effort (i.e., number of measurements) required to find the better configuration. We evaluate FLASH using 30 scenarios based on 7 software systems to demonstrate that FLASH saves effort in 100% and 80% of cases in single-objective and multi-objective problems respectively by up to several orders of magnitude compared to state-of-the-art techniques.
Machine learning software is increasingly being used to make decisions that affect people's lives. But sometimes, the core part of this software (the learned model), behaves in a biased manner that gives undue advantages to a specific group of people (where those groups are determined by sex, race, etc.). This łalgorithmic discriminationž in the AI software systems has become a matter of serious concern in the machine learning and software engineering community. There have been works done to find łalgorithmic biasž or łethical biasž in software system. Once the bias is detected in the AI software system, mitigation of bias is extremely important. In this work, we a) explain how ground truth bias in training data affects machine learning model fairness and how to find that bias in AI software, b) propose a method Fairway which combines preprocessing and in-processing approach to remove ethical bias from training data and trained model. Our results show that we can find bias and mitigate bias in a learned model, without much damaging the predictive performance of that model. We propose that (1) testing for bias and (2) bias mitigation should be a routine part of the machine learning software development life cycle. Fairway offers much support for these two purposes. CCS CONCEPTS • Software and its engineering → Software creation and management; • Computing methodologies → Machine learning.
Literature reviews can be time-consuming and tedious to complete. By cataloging and refactoring three state-of-the-art active learning techniques from evidence-based medicine and legal electronic discovery, this paper finds and implements FASTREAD, a faster technique for studying a large corpus of documents, combining and parametrizing the most efficient active learning algorithms. This paper assesses FASTREAD using datasets generated from existing SE literature reviews (Hall, Wahono, Radjenović, Kitchenham et al.). Compared to manual methods, FASTREAD lets researchers find 95% relevant studies after reviewing an order of magnitude fewer papers. Compared to other state-of-the-art automatic methods, FASTREAD reviews 20-50% fewer studies while finding same number of relevant primary studies in a systematic literature review.
Literature reviews are essential for any researcher trying to keep up to date with the burgeoning software engineering literature. Finding relevant papers can be hard due to the huge amount of candidates provided by search. FAST 2 is a novel tool for assisting the researchers to find the next promising paper to read. This paper describes FAST 2 and tests it on four large systematic literature review datasets. We show that FAST 2 robustly optimizes the human effort to find most (95%) of the relevant software engineering papers while also compensating for the errors made by humans during the review process. The effectiveness of FAST 2 can be attributed to three key innovations: (1) a novel way of applying external domain knowledge (a simple two or three keyword search) to guide the initial selection of paperswhich helps to find relevant research papers faster with less variances; (2) an estimator of the number of remaining relevant papers yet to be found-which helps the reviewer decide when to stop the review; (3) a novel human error correction algorithm-which corrects a majority of human misclassifications (labeling relevant papers as non-relevant or vice versa) without imposing too much extra human effort.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.