Time series databases are essential for the large-scale deployment of many critical industrial applications. In infrastructure monitoring, for instance, a database system should be able to process large amounts of sensor data in real time, execute continuous queries, and handle complex analytical queries such as anomaly detection or forecasting. Several benchmarks have been proposed to evaluate and understand how existing systems and design choices handle specific use cases and workloads. Unfortunately, none of them fully covers the particular requirements of monitoring applications. Furthermore, they fall short of providing an automated way to generate representative real-world data and workloads for testing and evaluating these systems.
We present TSM-Bench, a benchmark tailored for time series database systems used in monitoring applications. Our key contributions are (1) representative queries that meet the requirements we collected from a water monitoring use case, and (2) a new scalable data generation method based on Generative Adversarial Networks (GANs) and Locality Sensitive Hashing (LSH). We demonstrate, through an extensive set of experiments, how TSM-Bench provides a comprehensive evaluation of the performance of seven leading time series database systems while offering a detailed characterization of their capabilities and trade-offs.