Randomised field experiments, such as A/B testing, have long been the gold standard for evaluating software changes. In the automotive domain, however, running randomised field experiments is not always desirable, possible, or even ethical. In the face of such limitations, we develop BOAT (Bayesian causal modelling for ObservAtional Testing), a framework that combines observational studies with Bayesian causal inference to understand the real-world impact of complex automotive software updates and to help software development organisations reach causal conclusions. In this study, we present three causal inference models in the Bayesian framework, together with corresponding cases, to address three commonly experienced challenges of software evaluation in the automotive domain. We develop the BOAT framework with our industry collaborator and demonstrate the potential of causal inference through empirical studies on a large fleet of vehicles. Moreover, we relate the theoretical causal assumptions to their implications in practice, aiming to provide a comprehensive guide on how to apply the causal models in automotive software engineering. We apply Bayesian propensity score matching to produce balanced control and treatment groups when we do not have access to the entire user base, Bayesian regression discontinuity design to identify covariate-dependent treatment assignment and estimate the local treatment effect, and Bayesian difference-in-differences to infer treatment effects over time while implicitly controlling for unobserved confounding factors. Each demonstrative case is grounded in practice and represents a scenario in which randomisation is not feasible. With the BOAT framework, we enable online software evaluation in the automotive domain without the need for a fully randomised experiment.
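As illustration only (the abstract does not specify the paper's models, which are Bayesian), the intuition behind the difference-in-differences design mentioned above can be sketched with a classical point estimate on synthetic fleet data; all names and numbers below are hypothetical:

```python
import numpy as np

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Classical difference-in-differences point estimate: the change
    in the treated group minus the change in the control group, which
    nets out time trends shared by both groups (an implicit control
    for unobserved, time-invariant confounders)."""
    return (np.mean(treat_post) - np.mean(treat_pre)) - (
        np.mean(ctrl_post) - np.mean(ctrl_pre)
    )

# Hypothetical fleet metric: a common upward time trend of +1.0 affects
# both groups; the software update adds a true effect of +0.5 to the
# treated vehicles only.
rng = np.random.default_rng(0)
ctrl_pre = rng.normal(10.0, 0.1, 1000)
ctrl_post = rng.normal(11.0, 0.1, 1000)   # shared trend only
treat_pre = rng.normal(10.0, 0.1, 1000)
treat_post = rng.normal(11.5, 0.1, 1000)  # shared trend + treatment

effect = did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(round(effect, 2))  # recovers a value close to the true effect 0.5
```

A Bayesian variant, as in the BOAT framework, would place priors on the group and time effects and report a posterior over the treatment effect rather than a single point estimate.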