“…In essence, a data lake is a flexible, scalable data storage and management system, which ingests and stores raw data from heterogeneous sources in their original format, and provides maintenance, query processing and data analytics in an on-the-fly manner, with the help of rich metadata [116], [138], [142], [143]. Data lakes are proposed to store and manage data in many real-life use cases: Internet of things (IoT) and smart city [99], manufacturing [112], medicine [42], [55], [114], mobility service (e.g., Uber) [50], biology [23], smart grids [20], [103], air quality control [145], flights data [96], disease control, labor markets and products [13].…”