Introducing TPCx-HS: The First Industry Standard for Benchmarking Big Data Systems
2015
DOI: 10.1007/978-3-319-15350-6_1
Cited by 23 publications (7 citation statements)
References 2 publications
“…It gives an objective measurement of the system being tested, and through this the industry can compare multiple existing solutions based on several metrics. The TPCx‐HS benchmark focuses on testing commercialised Apache Hadoop systems [54, 55]. The testing covers both the hardware and software aspects of the system, including the operating system.…”
Section: Emerging and Existing Big Data Standards (mentioning)
confidence: 99%
“…A CIM RDF schema proposed in the IEC 61970‐501 standard is used to construct CML documents with power system model information, so that they can be exchanged. Key takeaway points:
- TPC benchmarks can be considered in tune with CMMs. They are application benchmarks for big data, working towards an industry‐standard benchmark for big data analytics.
- The authors of [47, 48, 51] cover various aspects of data security and are adaptable to all DG use‐cases, as described later in this paper.
- Consortia such as the Organization for the Advancement of Structured Information Standards (OASIS), the World Wide Web Consortium (W3C), the Open Geospatial Consortium (OGC), the Open GRASS Foundation (OGF), and others have created standards for data that are cited as relevant to big data.
- The authors of [50, 52] provide an overview of understanding and creating big data projects.
- TPC benchmarks [53–55] might provide better insight into the operation of big data analytics systems and might help users re‐evaluate their implemented models.…”
Section: Emerging and Existing Big Data Standards (mentioning)
confidence: 99%
“…Another benchmark is TPCx-HS, which adds a formal specification and enforcement rules that enable comparison of results among systems [35]. One characteristic of TPCx-HS is that it follows a stepped scaled sizing model on BigDataBench.…”
Section: Big-data Benchmarks (mentioning)
confidence: 99%
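The "stepped scaled sizing model" mentioned above means that runs may only be executed at a fixed ladder of data sizes rather than at arbitrary volumes. Below is a minimal sketch of that idea, assuming the commonly cited 1/3/10/30/... TB steps; the exact ladder is an assumption to be confirmed against the official TPCx-HS specification.

```python
# Minimal sketch of a stepped scale sizing check. The ladder below follows
# the commonly cited 1/3/10/30/... TB steps and is an assumption to verify
# against the official TPCx-HS specification.

ALLOWED_SCALE_FACTORS_TB = [1, 3, 10, 30, 100, 300, 1_000, 3_000, 10_000]

def validate_scale_factor(requested_tb: int) -> int:
    """Accept only data sizes that sit on the discrete ladder.

    A stepped sizing model means results are reported only at these
    points; arbitrary data volumes are not valid benchmark runs.
    """
    if requested_tb not in ALLOWED_SCALE_FACTORS_TB:
        raise ValueError(
            f"{requested_tb} TB is not an allowed scale factor; "
            f"choose one of {ALLOWED_SCALE_FACTORS_TB}"
        )
    return requested_tb

if __name__ == "__main__":
    print(validate_scale_factor(1))  # OK: smallest step on the ladder
    # validate_scale_factor(2)       # would raise: 2 TB is off the ladder
```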
“…The data scale used in this study, up to 2 TB, is considered a good indicator that KERMIT can perform efficiently in real-world big data applications. This was one of the team's goals because MRONLINE [25], the other research project that dealt with automatic tuning of Hadoop/YARN, used data sizes of at most 100 GB, which would be considered small compared to leading benchmarks such as TPCx-HS [54], whose smallest allowed data scale is 1 TB. The data scale is of particular interest; results at different data scales are not comparable to each other because the computational challenges can vary considerably at different data volumes.…”
Section: Results Analysis (mentioning)
confidence: 99%
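The comparability point above is reflected in how TPCx-HS reports performance: its primary metric is tied to a specific scale factor. Below is a brief sketch of that calculation, assuming the commonly cited form HSph@SF = SF / (T / 3600), with SF the scale factor in TB and T the elapsed run time in seconds; treat the formula as an assumption to be checked against the published specification.

```python
# Sketch of the TPCx-HS-style primary metric, assuming
# HSph@SF = SF / (T / 3600), with SF the scale factor in TB and
# T the elapsed wall-clock time of the full run in seconds.
# Verify the exact definition against the published specification.

def hsph_at_sf(scale_factor_tb: float, elapsed_seconds: float) -> float:
    """Return the per-scale-factor throughput metric (higher is better)."""
    elapsed_hours = elapsed_seconds / 3600.0
    return scale_factor_tb / elapsed_hours

if __name__ == "__main__":
    # Example: a 1 TB run finishing in 30 minutes scores 2.0 HSph@1TB.
    print(f"HSph@1TB = {hsph_at_sf(1, 1800):.2f}")
    # Scores are reported per scale factor; values at different scale
    # factors are not meant to be compared with one another.
```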
“…It was inspired by the TeraSort/TeraGen/TeraValidate utilities that have been commonly used by big data practitioners for several years. TPCx-HS requires that the generate, sort, and validate job sequence be executed sequentially without interruption [54]. It specifically disallows manual tuning between the benchmark stages, but explicitly allows automatic tuning.…”
Section: Hadoop Benchmarks (mentioning)
confidence: 99%
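To make the sequential, uninterrupted generate/sort/validate requirement concrete, here is a conceptual driver sketch. The hsgen.sh/hssort.sh/hsvalidate.sh command names are hypothetical placeholders rather than the official kit's interface; the point illustrated is that the three phases run back to back, are timed as one unit, and allow no manual tuning in between.

```python
# Conceptual driver: generate, sort, and validate run back to back with no
# manual intervention between phases. The command names are hypothetical
# placeholders, not the official TPCx-HS kit interface.
import subprocess
import time

PHASES = [
    ("generate", ["./hsgen.sh", "--size", "1TB"]),  # placeholder command
    ("sort",     ["./hssort.sh"]),                  # placeholder command
    ("validate", ["./hsvalidate.sh"]),              # placeholder command
]

def run_benchmark() -> float:
    """Run all phases sequentially and return the total elapsed seconds."""
    start = time.monotonic()
    for name, cmd in PHASES:
        print(f"starting phase: {name}")
        # check=True aborts the run if any phase fails, mirroring the
        # requirement that the sequence complete without interruption.
        subprocess.run(cmd, check=True)
    return time.monotonic() - start

if __name__ == "__main__":
    elapsed = run_benchmark()
    print(f"total elapsed time: {elapsed:.1f} s")
```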