Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Practice 2020
DOI: 10.1145/3377813.3381349

Engineering for a science-centric experimentation platform

Abstract: Netflix is an internet entertainment service that routinely employs experimentation to guide strategy around product innovations. As Netflix grew, it had the opportunity to explore increasingly specialized improvements to its service, which generated demand for deeper analyses supported by richer metrics and powered by more diverse statistical methodologies. To facilitate this, and more fully harness the skill sets of both engineering and data science, Netflix engineers created a science-centric experimentatio…

Citations: cited by 6 publications (5 citation statements)
References: 36 publications
“…It allows us to understand whether changing a ranking algorithm improves search results, sending a notification increases app visits, or a bug fix that should do nothing has any surprising results. When possible, A/B testing is the gold standard for understanding treatment effects, and therefore companies are increasingly using A/B tests to make decisions (Yin and Hong, 2019; Diamantopoulos et al, 2020; Karrer et al, 2021). LinkedIn currently runs hundreds of A/B tests per day, analyzing the impact of both major and minor adjustments, and always attempting to make data-driven decisions built on insights from experiments (Xu et al, 2015).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Controlled experiments, also known as "A/B testing," continue to serve as the cornerstone for making strategic decisions in business, including new product launches, marketing campaigns, and algorithm updates (Bakshy et al 2014, Kohavi et al 2020, Diamantopoulos et al 2020, Bojinov and Gupta 2022, Koning et al 2022). Through the random assignment of treatment or control groups, A/B testing facilitates the evaluation of causal, rather than merely correlational, impacts of a product intervention on business outcomes.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…First, it must be able to scale both to large sample sizes, which can be as large as hundreds of millions of observations, and to many features, sometimes in the thousands. Second, it should be reproducible and extensible such that software engineers and researchers can interact with, iterate on, and subsequently contribute to it (Diamantopoulos et al, 2020).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Addressing the second challenge, Netflix described an inclusive XP that makes use of single-machine computation for modeling, allowing it to be more interactive and consistent with the way researchers iterate (Diamantopoulos et al, 2020). As a result, researchers can reproduce analyses from the XP, iterate, follow up, and debug using Python and R, and then contribute improvements to statistical methodology back to the engineering systems.…”
Section: Introduction
Citation type: mentioning
confidence: 99%