Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data 2020
DOI: 10.1145/3318464.3389768

Active Learning for ML Enhanced Database Systems

Abstract: Recent research has shown promising results by using machine learning (ML) techniques to improve the performance of database systems, e.g., in query optimization or index recommendation. However, in many production deployments, the ML models' performance degrades significantly when the test data diverges from the data used to train these models. In this paper, we address this performance degradation by using B-instances to collect additional data during deployment. We propose an active data collection platform…

Cited by: 48 publications (13 citation statements)
References: 44 publications

“…Input: E_p, E_s. The current adjusting scheme is actually an empirical algorithm. Although our experiments suggest the effectiveness of the algorithm, it is more attractive to employ a machine-learning algorithm [29,30] to make AMG-Buffer learn the access pattern, which can be further used to adjust the size of the P-Buffer.…”
Section: Algorithm 3: Adjust the P-buffer
Citation type: mentioning
confidence: 97%
“…Recently, there has been significant interest in using machine learning for database tuning [12,19,20,25,27]. Our work falls into the same, broad category as it exploits RL.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Sampling from discrete distributions can be achieved with (among others) inverse transform sampling, or the Gumbel-max trick [4] (see Section 4.1.1) and extensions thereof (see Section 4.3). Gumbel-based sampling algorithms have for example been used for (discrete) action selection in a multi-armed bandit setting [10], for sampling data points in active learning [11], for text generation in dialog systems [12] or in translation tasks [13], [14].…”
Section: Applications
Citation type: mentioning
confidence: 99%