ASTERIX: towards a scalable, semistructured data platform for evolving-world models

Behm, Alexander; Borkar, Vinayak; Carey, Michael J.; Grover, Raman; Li, Chen; Onose, Nicola; Vernica, Rares; Deutsch, Alin; Papakonstantinou, Yannis; Tsotras, Vassilis J.

doi:10.1007/s10619-011-7082-y

Cited by 95 publications

(53 citation statements)

References 31 publications


“…Hive, Pig, and Tenzing focus mostly on providing to end users a higher level declarative interface that is compiled down to MapReduce tasks [32,27,15]. Hyracks and Asterix [12,11] are exploring more flexible building blocks than just map and reduce to build massively parallel databases. Greenplum and Aster Data have added the ability to execute MapReduce-style functions over data stored in these systems.…”

Section: Related Work (mentioning)

confidence: 99%

Shark: Fast Data Analysis Using Coarse-grained Distributed Memory

Engle¹

2013


Abstract. Shark is a research data analysis system built on a novel coarse-grained distributed shared-memory abstraction. Shark marries query processing with deep data analysis, providing a unified system for easy data manipulation using SQL and pushing sophisticated analysis closer to data. It scales to thousands of nodes in a fault-tolerant manner. Shark can answer queries 40X faster than Apache Hive and run machine learning programs 25X faster than MapReduce programs in Apache Hadoop on large datasets. This is a complete overview of the development of Shark, including design decisions, performance details, and comparison with existing data warehousing solutions. It demonstrates some of Shark's distinguishing features including its in-memory columnar caching and its unified machine learning interface.


“…4 and if the number exceeds thirty two, the situation can be resolved by establishing links with two or more relay nodes. The model is shaped as a "Star" form as a single router can establish links with thirty two nodes, and has all the features of routers (OPNET) [12][13][14][15][16][17].…”

Section: PLC Router Node Model (mentioning)

confidence: 99%

Simulation of Power Line Communication Slient Node Problem Using OPNET

Huh¹,

Seo²

2017

Journal of Korea Multimedia Society


The Information & Communication Technology (ICT) and the Internet of Things (IoT) have become the major issues in Republic of Korea recently. While RS-232, Zigbee, and WiFi-related technologies are used in the ICT-based systems, we focus on the Power Line Communication (PLC) in this paper. By carrying out OPNET simulations, we've implemented the PLC Router Node Model, PLC Terminal Node Model, PLC Link Model, and PLC Palette Model and executed the simulations arranging 20 holds within the range of 400m (20m apart). As a result, we confirmed that the slient node problem had occurred at the point of 200m-2000m (as of 2016) distance preventing further communications. However, the control group, by contrast, was able to carry out the communications by installing a router. We expect that this paper will contribute to the development of a foundation technology which will save costs by performing the simulation prior to building an actual large-scale ICT Complex in future work.


“…Since the use of PC, internet or mobile devices has become people's daily routine, the volume of data left behind by them is increasing rapidly [3][4]. Along with the fact that the volume of big data has increased explosively, the types of data have been also diversified such that people's behaviors, as well as their thoughts and opinions can be anticipated through positional information and SNS services.…”

Section: Related Research (mentioning)

confidence: 99%

“…Currently in the Republic of Korea (ROK), for the fishing industry, it is expected that the building-type fish farms will emerge in the town centers or suburbs soon as the research on the building-type fish farms led by the Ministry of Maritime Affairs and Fisheries is near completion [1][2][3][4][5][6][7][8]. At the same time, for the Agriculture sector, the crop farms that utilize LED lightings could appear at the sites centering around downtown areas.…”

Section: Introduction (mentioning)

confidence: 99%

A Preliminary Analysis Model of Big Data for Prevention of Bioaccumulation of Heavy Metal-Based Pollutants: Focusing on the Atmospheric Data Analyses

Kim¹,

Seo²

2016

Advanced Science and Technology Letters


Abstract. In the Republic of Korea, the building-type fish and agricultural farms are expected to emerge in the town areas or suburbs. Developed farming technologies that employ water recirculation equipment or LED lights are becoming more common and convenient. However, there are still some requirements that must be met to successfully operate the farms, and these requirements must be identified through analyses of various factors surrounding farms. This study conducts a research to obtain the analytical results and investigates their characteristics through visualization of the atmospheric environment data of Gangnam District provided by the Seoul Metropolitan Government in order to perform modeling of the preliminary big data analysis against the pollutants as a countermeasure to the bioaccumulation of heavy metals in the agricultural and marine products. The basic research was performed by visualizing the data obtained from the univariate, simple and multiple regression analyses for easy viewing, finding a log-transformed model, and modeling overall characteristics through categorization of the explanatory variables. We hope that this research will assist the farmers in selecting their farming locations.

