A key technology for analyzing high-volume spatio-temporal data streams is complex event processing (CEP). CEP is unique in its ability not only to process data continuously as it arrives through common operations such as aggregations, but also to support pattern matching queries. Pattern matching detects a user-defined sequence of temporal predicates on event streams. The high-volume flight data provided by the OpenSky Network has many characteristics that make it a good fit for CEP. In particular, pattern matching operators can be used to detect a wide range of movement patterns (landing, takeoff, evasion) and group patterns (airplanes closing in on each other) in a timely manner. However, CEP queries can be complex and may require a combination of domain expertise and historical data analysis to deliver the desired results. To address these issues, we have combined a database-backed CEP system (ChronicleDB) with a scientific toolbox for interactive data exploration and geo-visualization (Vat System). This allows users to execute CEP queries interactively and visually confirm the validity of their results, which simplifies parameter tuning considerably. In addition, our solution supports efficient and interactive time travel queries. It allows event streams to be combined with additional data sources (e.g., remote sensing images) and processing technologies (e.g., machine learning models) to extract higher-level knowledge. Finally, our ongoing work on visual analytics explores extrapolating query results to provide more timely feedback in critical situations, as well as multi-query optimization techniques to make the system more efficient overall.
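To make the notion of a sequence of temporal predicates concrete, the following is a minimal sketch of a sequence matcher over one aircraft's position events. It is not ChronicleDB's query language or API. The event fields, the `match_sequence` helper, the altitude thresholds, and the simple "skip non-matching events" semantics are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Sequence


@dataclass(frozen=True)
class PositionEvent:
    """One position report of a single aircraft (hypothetical fields)."""
    timestamp: int       # seconds since some epoch
    altitude_m: float    # altitude in meters
    on_ground: bool      # True once the aircraft has touched down


Predicate = Callable[[PositionEvent], bool]


def match_sequence(
    events: Iterable[PositionEvent],
    predicates: Sequence[Predicate],
    within_s: int,
) -> Iterator[List[PositionEvent]]:
    """Yield event lists that satisfy the predicates in order.

    Events that do not satisfy the next predicate are skipped. A partial
    match is discarded once it spans more than `within_s` seconds.
    """
    partial: List[PositionEvent] = []
    for ev in events:
        # Drop a partial match whose time window has expired.
        if partial and ev.timestamp - partial[0].timestamp > within_s:
            partial = []
        # Advance the match if this event satisfies the next predicate.
        if predicates[len(partial)](ev):
            partial.append(ev)
            if len(partial) == len(predicates):
                yield list(partial)
                partial = []


# Assumed "landing" pattern: cruising, then low approach, then on the ground.
LANDING: List[Predicate] = [
    lambda e: not e.on_ground and e.altitude_m > 3000,
    lambda e: not e.on_ground and e.altitude_m < 1000,
    lambda e: e.on_ground,
]
```

Called as `match_sequence(stream, LANDING, within_s=1800)`, the sketch reports a landing only when all three phases occur within 30 minutes. The window length and the altitude thresholds are exactly the kind of parameters that interactive, visual validation is meant to help tune.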
Event stores face the difficult challenge of continuously ingesting massive temporal data streams while satisfying demanding query and recovery requirements. Many of today’s systems deal with multiple hardware-based trade-offs. For instance, long-term storage solutions balance keeping data on cheap secondary media (SSDs, HDDs) against performance-oriented main-memory caches. As an alternative, in-memory systems focus on performance at the expense of monetary cost and, to some degree, recovery guarantees. The advent of persistent memory (PMem) has led to a multitude of novel research proposals aiming to alleviate those trade-offs in various fields. So far, however, there is no proposal for a specialized PMem-powered event store.
Based on ChronicleDB, we present several complementary approaches for a three-layer architecture featuring main memory, PMem, and secondary storage. We enhance some of ChronicleDB’s components with PMem for better insertion and query performance as well as better recovery guarantees. At the same time, the three-layer architecture aims to keep the overall dollar cost of a system low. The limitations and opportunities of a PMem-enhanced event store serve as important groundwork for a comprehensive system design that exploits a modern storage hierarchy.
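As a rough illustration of the three-layer idea, the toy model below keeps the newest events in a small main-memory tier, demotes older ones to a larger tier standing in for PMem, and finally spills to a tier standing in for secondary storage. It is not ChronicleDB's design: the `TieredEventStore` class, the capacities, and the purely age-based demotion policy are assumptions, and all three tiers are ordinary in-process Python containers rather than real storage devices.

```python
from collections import OrderedDict
from typing import Any, Optional, Tuple


class TieredEventStore:
    """Toy three-tier store: DRAM -> PMem -> disk (all simulated in memory)."""

    def __init__(self, dram_capacity: int, pmem_capacity: int) -> None:
        self.dram_capacity = dram_capacity
        self.pmem_capacity = pmem_capacity
        self.dram: "OrderedDict[int, Any]" = OrderedDict()  # newest events
        self.pmem: "OrderedDict[int, Any]" = OrderedDict()  # middle tier
        self.disk: dict = {}                                # oldest events

    def append(self, timestamp: int, payload: Any) -> None:
        """Insert an event and demote the oldest entries of full tiers."""
        self.dram[timestamp] = payload
        if len(self.dram) > self.dram_capacity:
            old_ts, old = self.dram.popitem(last=False)  # oldest in DRAM
            self.pmem[old_ts] = old
        if len(self.pmem) > self.pmem_capacity:
            old_ts, old = self.pmem.popitem(last=False)  # oldest in PMem
            self.disk[old_ts] = old

    def get(self, timestamp: int) -> Optional[Tuple[str, Any]]:
        """Return (tier name, payload), searching the fastest tier first."""
        for name, tier in (("dram", self.dram),
                           ("pmem", self.pmem),
                           ("disk", self.disk)):
            if timestamp in tier:
                return name, tier[timestamp]
        return None
```

In this model the cost trade-off reduces to how many events each tier is allowed to hold. A real system would additionally have to address durability and recovery for the PMem tier, which this sketch does not attempt.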