Although database systems perform well in data access and manipulation, their relational model hinders data scientists from formulating machine learning algorithms in SQL. Nevertheless, we argue that modern database systems perform well for machine learning algorithms expressed in relational algebra. To overcome the barrier of the relational model, this paper shows how to transform data into a relational representation for training neural networks in SQL. We first describe building blocks for data transformation, model training and inference in SQL-92, along with their counterparts using an extended array data type. We then compare the implementation of model training and inference using array data types to the one using a relational representation in SQL-92 only. The evaluation of runtime and memory consumption demonstrates the suitability of modern database systems for matrix algebra, although specialised array data types perform better than matrices in relational representation.
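To make the idea of a matrix in relational representation concrete, the following is a minimal sketch and not the paper's actual implementation. It assumes each matrix is stored as a table of `(i, j, v)` tuples (row index, column index, value), so that a matrix product becomes a join on the shared dimension followed by a grouped sum. The table names `a` and `b`, the helper functions, and the use of SQLite as a stand-in database system are all illustrative assumptions.

```python
import sqlite3


def to_relation(matrix):
    """Flatten a dense matrix (list of rows) into 1-indexed (i, j, v) tuples."""
    return [
        (i, j, v)
        for i, row in enumerate(matrix, start=1)
        for j, v in enumerate(row, start=1)
    ]


def matmul_sql(a, b):
    """Multiply two matrices held in relational form using plain SQL.

    Hypothetical schema: tables a(i, j, v) and b(i, j, v).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (i INTEGER, j INTEGER, v REAL)")
    conn.execute("CREATE TABLE b (i INTEGER, j INTEGER, v REAL)")
    conn.executemany("INSERT INTO a VALUES (?, ?, ?)", to_relation(a))
    conn.executemany("INSERT INTO b VALUES (?, ?, ?)", to_relation(b))
    # Join on the shared dimension (columns of a = rows of b),
    # then sum the element-wise products per output cell.
    rows = conn.execute(
        "SELECT a.i, b.j, SUM(a.v * b.v) AS v "
        "FROM a INNER JOIN b ON a.j = b.i "
        "GROUP BY a.i, b.j "
        "ORDER BY a.i, b.j"
    ).fetchall()
    conn.close()
    return rows


# Example: inputs x times weights w, as in one linear layer's forward pass.
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[5.0, 6.0], [7.0, 8.0]]
product = matmul_sql(x, w)
```

Here `product` holds the result again as `(i, j, v)` tuples, so it could feed further relational operators without leaving the database. An array data type would instead keep each matrix in a single attribute, which is the alternative the abstract compares against.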
Conventional wisdom holds that handling the database load of web-scale applications requires a cluster of database servers and a caching layer. In this work, we argue that modern main-memory database systems are often fast enough to consolidate this complex architecture into a single server (plus an additional failover system). To demonstrate this claim, we design the Monopedia Benchmark, a benchmark for web-scale applications modeled after Wikipedia. Using this benchmark, we show that it is indeed possible to run the database workload of one of the largest web sites in the world on a single database server.