In recent years, computational science and engineering (CSE) simulations using high-performance computing resources are actively exploited to solve complex domain-specific problems. Thanks to the remarkable advance of IT technology, the CSE community is challenging more complex and difficult problems than ever before, by running these simulations online. In this regard, we often witness that 1) online simulation users suffer from knowing little about the estimated termination time of their launched simulations and 2) the limited computing resources are squandered by wrong input that leads the simulations to run forever. To address such issues, we propose a novel execution time estimation scheme, termed EXTES, using machine learning techniques for more efficient online CSE simulations. With a large amount of existing provenance data, the EXTES scheme trains a suite of models rooted from classification, regression, and a hybrid of the two and utilize these models to estimate the execution time for specified input parameters for simulations. In the experiments on real simulation data, our proposed models achieved about 73% accuracy on average in execution time estimation across 16 simulation programs taken from a variety of CSE fields. In the meantime, the overhead incurred by the training and estimation is almost negligible.
Young-Kyoon SUH †a) , Member, Seounghyeon KIM †b) , Joo-Young LEE †c) , Hawon CHU †d) , Junyoung AN †e) , and Kyong-Ha LEE † †f) , Nonmembers
SUMMARYIn this letter we analyze the economic worth of GPU on analytical processing of GPU-accelerated database management systems (DBMSes). To this end, we conducted rigorous experiments with TPC-H across three popular GPU DBMSes. Consequently, we show that co-processing with CPU and GPU in the GPU DBMSes was cost-effective despite exposed concerns.
Recently, various environmental data, such as microdust pollution, temperature, humidity, etc., have been continuously collected by widely deployed Internet of Things (IoT) sensors. Although these data can provide great insight into developing sustainable application services, it is challenging to rapidly retrieve such data, due to their multidimensional properties and huge growth in volume over time. Existing indexing methods for efficiently locating those data expose several problems, such as high administrative cost, spatial overhead, and slow retrieval performance. To mitigate these problems, we propose a novel indexing scheme termed ST-Trie, for efficient retrieval over spatiotemporal IoT environment data. Given IoT sensor data with latitude, longitude, and time, the proposed scheme first converts the three-dimensional attributes to one-dimensional index keys. The scheme then builds a trie-based index, consisting of internal nodes inserted by the converted keys and leaf nodes containing the keys and pointers to actual IoT data. We leverage this index to process various types of queries. In our experiments with three real-world datasets, we show that the proposed ST-Trie index outperforms existing approaches by a substantial margin regarding response time. Furthermore, we show that the query processing performance via ST-Trie also scales very well with an increasing time interval. Finally, we demonstrate that when compressed, the ST-Trie index can significantly reduce its space overhead by approximately a factor of seven.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.