2012 IEEE Conference on High Performance Extreme Computing
DOI: 10.1109/hpec.2012.6408678

Driving big data with big compute

Abstract: Big Data (as embodied by Hadoop clusters) and Big Compute (as embodied by MPI clusters) provide unique capabilities for storing and processing large volumes of data. Hadoop clusters make distributed computing readily accessible to the Java community, and MPI clusters provide high parallel efficiency for compute-intensive workloads. Bringing the big data and big compute communities together is an active area of research. The LLGrid team has developed and deployed a number of technologies that aim to pro…

Cited by 30 publications (24 citation statements)
References 7 publications (6 reference statements)
“…We have previously demonstrated data ingest rates in excess of four million records per second for our 8-node Accumulo instance, with speedup for up to 256 client processes parsing raw data and inserting records [11]. These ingest rates on the LLCySA system are roughly an order of magnitude faster than insert rates for traditional relational databases reported on the web [12], and the Accumulo architecture offers significantly more scalability.…”
Section: Case Study: Network Situational Awareness (mentioning)
confidence: 93%
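
The ingest pattern the citing authors describe (many parallel client processes parsing raw records and writing them into Accumulo) maps onto Accumulo's standard BatchWriter client. Below is a minimal sketch, assuming an Accumulo 1.x-era Java client API; the instance name, ZooKeeper address, credentials, table name, and record contents are placeholders, not the actual LLCySA configuration:

```java
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;

public class IngestSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- not the actual LLCySA deployment.
        Connector conn = new ZooKeeperInstance("instance", "zk1:2181")
                .getConnector("user", new PasswordToken("secret"));

        // One BatchWriter per client process; Accumulo buffers and batches the writes.
        BatchWriter writer = conn.createBatchWriter("events", new BatchWriterConfig());

        for (long i = 0; i < 1_000_000; i++) {
            Mutation m = new Mutation("row_" + i);  // row key parsed from raw data
            m.put("meta", "source", new Value("sensor".getBytes()));
            writer.addMutation(m);
        }
        writer.close();  // flushes any remaining buffered mutations
    }
}
```

In the setup quoted above, up to 256 such client processes insert records concurrently, which is where the reported aggregate ingest rate comes from.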
“…The reduce task will wait until all the mapper tasks are completed by setting a job dependency between the mapper tasks and the reducer task. LLGrid MapReduce is covered in more detail in [16].…”
Section: Batch Computing Jobs and LLMapReduce Jobs (mentioning)
confidence: 99%
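
LLGrid MapReduce realizes this barrier by registering the reducer job with a scheduler dependency on all of the mapper jobs [16]. As a rough in-process analogue only, the Java sketch below shows the same wait-for-all-mappers pattern using futures; the thread pool, task counts, and the squaring "map" function are illustrative placeholders, not LLGrid's actual scheduler mechanism:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MapThenReduce {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Submit the "mapper" tasks; each returns a partial result.
        List<Future<Integer>> mappers = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            final int chunk = i;
            mappers.add(pool.submit((Callable<Integer>) () -> chunk * chunk));
        }

        // The "reducer" blocks on every mapper future, mirroring the
        // job dependency LLGrid MapReduce sets between mapper and reducer jobs.
        int total = 0;
        for (Future<Integer> f : mappers) {
            total += f.get();  // waits until that mapper task has finished
        }
        System.out.println("reduced result = " + total);

        pool.shutdown();
    }
}
```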
“…To achieve parallelism, we utilized LLGrid's LLGrid MapReduce facility [7]. To evaluate the algorithms, we used the macro-averaged precision and recall to obtain an F1 score for each algorithm as defined in [8].…”
Section: A. Experiments Setup (mentioning)
confidence: 99%
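
For reference, combining macro-averaged precision and recall into an F1 score is conventionally computed as below (the citing authors' exact definition is the one given in their reference [8]); here C is the number of classes and TP_c, FP_c, FN_c are the per-class true positives, false positives, and false negatives:

```latex
P_{\mathrm{macro}} = \frac{1}{C}\sum_{c=1}^{C} \frac{TP_c}{TP_c + FP_c}, \qquad
R_{\mathrm{macro}} = \frac{1}{C}\sum_{c=1}^{C} \frac{TP_c}{TP_c + FN_c}, \qquad
F_1 = \frac{2\, P_{\mathrm{macro}}\, R_{\mathrm{macro}}}{P_{\mathrm{macro}} + R_{\mathrm{macro}}}
```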