2013 IEEE High Performance Extreme Computing Conference (HPEC) 2013
DOI: 10.1109/hpec.2013.6670330

Understanding query performance in Accumulo

Abstract: Open-source, BigTable-like distributed databases provide a scalable storage solution for data-intensive applications. The simple key-value storage schema provides fast record ingest and retrieval, nearly independent of the quantity of data stored. However, real applications must support non-trivial queries that require careful key design and value indexing. We study an Apache Accumulo-based big data system designed for a network situational awareness application. The application's storage schema and data retri…

Cited by 11 publications (5 citation statements)
References 7 publications
“…The strategy in [14] also used tablet location information to determine where clients could write locally. Knowing tablet-to-tablet-server assignment could likewise aid Graphulo, not only to minimize network traffic but also to partly eliminate Apache Thrift RPC serialization, which prior work has shown is a bottleneck for scans when iterator processing is light [18]. Such an enhancement would access a local tablet server by method call in place of Scanners and BatchWriters.…”
Section: A Related Work
mentioning
confidence: 99%
“…However it is some research seems to suggests that this problem can be easily bypassed using a big data structure. The problem of the storage size and the query performances could also be explored better even if some research already been conducted [23].…”
Section: Performance Analysis
mentioning
confidence: 99%
“…The peak insert rate for a single thread is typically ~100,000 entries per second. A typical single node server can reach ~500,000 entries per second using several insert threads [Sawyer 2013]. For the hypothetical Hadoop cluster described in the previous section, the peak performance would be ~100,000,000 entries per second.…”
Section: Accumulo
mentioning
confidence: 99%
“…IV. ACCUMULO Relational or SQL (Structured Query Language) databases [Codd 1970, Stonebraker et al 1976] have been the de facto interface to databases since the 1980s and are the bedrock of electronic transactions around the world. More recently, key-value stores (NoSQL databases) [Chang et al 2008] have been developed for representing large sparse tables to aid in the analysis of data for Internet search.…”
mentioning
confidence: 99%