Omer Dawelbeit scite author profile

Omer Dawelbeit

3 Publications

1 Citation Statement Received

27 Citation Statements Given

Affiliations

University of Reading

Publications

Efficient Dictionary Compression for Processing RDF Big Data Using Google BigQuery

Dawelbeit, McCrindle (2016)

The Resource Description Framework (RDF) data model is used on the Web to express billions of structured statements on a wide range of topics, including government, publications, life sciences, etc. Consequently, processing and storing this data requires the provision of high-specification systems, both in terms of storage and computational capabilities. On the other hand, cloud-based big data services such as Google BigQuery can be used to store and query this data without any upfront investment. Google BigQuery pricing is based on the size of the data being stored or queried, but given that RDF statements contain long Uniform Resource Identifiers (URIs), the cost of querying and storing RDF big data can increase rapidly. In this paper we present and evaluate a novel and efficient dictionary compression algorithm which is faster, generates small dictionaries that can fit in memory, and results in a better compression rate when compared with other large-scale RDF dictionary compression approaches. Consequently, our algorithm also reduces the BigQuery storage and query cost.
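To illustrate the general idea behind dictionary compression of RDF data, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details are not given in the abstract; it shows only plain dictionary encoding, in which each distinct term is replaced by a small integer. The function names and the sample URIs are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


def encode_triples(triples: List[Triple]) -> Tuple[Dict[str, int], List[EncodedTriple]]:
    """Replace every distinct term with an integer ID (generic dictionary encoding)."""
    dictionary: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for subject, predicate, obj in triples:
        ids = []
        for term in (subject, predicate, obj):
            # Assign the next free ID the first time a term is seen.
            if term not in dictionary:
                dictionary[term] = len(dictionary)
            ids.append(dictionary[term])
        encoded.append((ids[0], ids[1], ids[2]))
    return dictionary, encoded


def decode_triples(dictionary: Dict[str, int], encoded: List[EncodedTriple]) -> List[Triple]:
    """Invert the dictionary to recover the original terms."""
    reverse = {term_id: term for term, term_id in dictionary.items()}
    return [(reverse[s], reverse[p], reverse[o]) for s, p, o in encoded]


# Hypothetical sample data: long URIs repeated across statements.
sample = [
    ("http://example.org/person/alice", "http://example.org/vocab/knows", "http://example.org/person/bob"),
    ("http://example.org/person/bob", "http://example.org/vocab/knows", "http://example.org/person/alice"),
]
dictionary, encoded = encode_triples(sample)
```

Because each long URI is stored once in the dictionary and referenced by integer elsewhere, the encoded statements are much smaller than the originals, which is what drives down size-based storage and query costs.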

A novel cloud based elastic framework for big data preprocessing

Dawelbeit, McCrindle (2014)

A number of analytical big data services based on the cloud computing paradigm, such as Amazon Redshift and Google BigQuery, have recently emerged. These services are based on columnar databases rather than traditional Relational Database Management Systems (RDBMS) and are able to analyse massive datasets in mere seconds. This has led many organisations to retain and analyse their massive logs, sensory or marketing datasets, which were previously discarded due to the inability to either store or analyse them. Although these big data services have addressed the issue of big data analysis, the ability to efficiently de-normalise and prepare this data in a format that can be imported into these services remains a challenge. This paper describes and implements a novel, generic and scalable cloud-based elastic framework for Big Data Preprocessing (BDP). Since the approach described by this paper is entirely based on cloud computing, it is also possible to measure the overall cost incurred by these preprocessing activities.

CloudEx: A Novel Cloud-Based Task Execution Framework

Dawelbeit, McCrindle (2016)

In recent years cloud computing has seen steady adoption due to its unique features such as elasticity, fault tolerance and utility billing. Cloud computing Infrastructure-as-a-Service (IaaS) enables unique architectures that can dynamically scale and configure computing resources from a catalogue of available features. In addition to provisioning long-running homogeneous clusters of Virtual Machines (VMs), it can also be feasible to provision ephemeral and heterogeneous per-job VMs. This is made possible due to the reduced VM startup time and per-minute billing for cloud VMs. In this paper we design and implement CloudEx, a generic and novel framework for executing jobs on public clouds by leveraging the Google Cloud Platform. CloudEx enables users to split jobs into a sequence of smaller tasks that can be distributed using Bin Packing or a user-defined algorithm. Additionally, users can specify the VM specification per job or per task; CloudEx then provisions the required VMs, coordinates the job execution and terminates these VMs once the job is completed.
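As an illustration of how bin packing can distribute tasks across VMs, the following Python sketch uses the first-fit decreasing heuristic. The abstract does not say which bin packing variant CloudEx uses, so this is an assumption chosen for brevity; the function name, task names, costs and capacity are all hypothetical.

```python
from typing import Dict, List


def first_fit_decreasing(task_costs: Dict[str, int], capacity: int) -> List[List[str]]:
    """Group tasks into bins (e.g. VMs) so that no bin exceeds the given capacity."""
    bins: List[dict] = []
    # Place the largest tasks first; ties keep their original order.
    for name, cost in sorted(task_costs.items(), key=lambda kv: kv[1], reverse=True):
        if cost > capacity:
            raise ValueError(f"task {name!r} does not fit in any bin")
        for current in bins:
            # Use the first bin that still has room for this task.
            if current["remaining"] >= cost:
                current["tasks"].append(name)
                current["remaining"] -= cost
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append({"remaining": capacity - cost, "tasks": [name]})
    return [current["tasks"] for current in bins]


# Hypothetical task costs (arbitrary units) and per-VM capacity.
tasks = {"a": 4, "b": 8, "c": 1, "d": 4, "e": 2, "f": 1}
assignment = first_fit_decreasing(tasks, capacity=10)
```

Each inner list of the result corresponds to the tasks assigned to one VM; with the sample input above, two bins are enough.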
