2020 IEEE International Symposium on Workload Characterization (IISWC) 2020
DOI: 10.1109/iiswc50251.2020.00017

A Rigorous Benchmarking and Performance Analysis Methodology for Python Workloads

Abstract: Computer architecture and computer systems research and development is heavily driven by benchmarking and performance analysis. It is thus of paramount importance that rigorous methodologies are used to draw correct conclusions and steer research and development in the right direction. While rigorous methodologies are widely used for native and managed programming language workloads, scripting language workloads are subject to ad-hoc methodologies which lead to incorrect and misleading conclusions. In particul…

Cited by 6 publications (7 citation statements)
References 24 publications

“…The measurement of execution time of non-idiomatic and idiomatic code is far from trivial due to the non-determinism such as Python Virtual Machine (VM) and garbage collector [7]. To overcome such non-determinism, researchers repeatedly execute the code multiple times [7], [5], [8], [26], [27], [28], [10], [9]. Since different VM invocations may result in different code execution time, and multiple executions of the code in a VM invocation may vary, we execute each code 35 iterations on 50 VM invocations as done in previous studies [7], [8], [9], [10].…”
Section: B Performance Measurement
Citation type: mentioning
confidence: 99%
“…To overcome such non-determinism, researchers repeatedly execute the code multiple times [7], [5], [8], [26], [27], [28], [10], [9]. Since different VM invocations may result in different code execution time, and multiple executions of the code in a VM invocation may vary, we execute each code 35 iterations on 50 VM invocations as done in previous studies [7], [8], [9], [10]. Since first iterations (i.e., warm-up iterations) are subject to noise caused by library loading, measurements are only collected in iterations that are subsequent to warmup.…”
Section: B Performance Measurement
Citation type: mentioning
confidence: 99%
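
The measurement scheme quoted above (several iterations inside each of several fresh VM invocations, with warm-up iterations discarded) can be sketched roughly as follows. This is a minimal illustration, not the cited authors' harness: the toy `workload`, the helper names `run_invocation` and `measure`, and the warm-up count of 5 are all assumptions, since the excerpt does not say how many iterations count as warm-up.

```python
import json
import statistics
import subprocess
import sys

# Hypothetical child script: times `iterations` runs of a toy workload
# inside one interpreter process and prints the timings as JSON.
CHILD = """
import json, sys, time

def workload():
    return sum(i * i for i in range(10_000))

iterations = int(sys.argv[1])
timings = []
for _ in range(iterations):
    start = time.perf_counter()
    workload()
    timings.append(time.perf_counter() - start)
print(json.dumps(timings))
"""


def run_invocation(iterations):
    """Start a fresh interpreter and return its per-iteration timings."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD, str(iterations)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)


def measure(invocations, iterations, warmup):
    """Return one mean post-warm-up time per VM invocation."""
    if warmup >= iterations:
        raise ValueError("warmup must be smaller than iterations")
    means = []
    for _ in range(invocations):
        timings = run_invocation(iterations)
        # Drop the first `warmup` iterations, which are affected by
        # effects such as library loading.
        means.append(statistics.mean(timings[warmup:]))
    return means


if __name__ == "__main__":
    # 35 iterations on 50 invocations follows the quoted statement;
    # warmup=5 is an assumed value for illustration only.
    per_invocation = measure(invocations=50, iterations=35, warmup=5)
    print(f"mean: {statistics.mean(per_invocation):.6f} s")
    print(f"stdev across invocations: {statistics.stdev(per_invocation):.6f} s")
```

Keeping one mean per invocation separates variation between VM invocations from variation between iterations within an invocation, which is the distinction the quoted statement draws.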
“…However, researchers have not yet reached a consensus on averaging the speedups of multiple benchmarks. Some [31] endorse the geometric mean, some [24, 32] advocate the harmonic mean, and some [33] argue that the choice does not matter. Thus, we show both the geometric and harmonic mean speedup over time in Figure 10.…”
Section: The Mean Speedup Over Time
Citation type: mentioning
confidence: 99%
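
To make the difference between the two averages in the statement above concrete, here is a small sketch using Python's standard `statistics` module. The speedup values are hypothetical and chosen only for illustration; they do not come from the cited work.

```python
import statistics

# Hypothetical per-benchmark speedups (baseline time / new time).
speedups = [2.0, 0.5, 4.0, 1.0]

arith = statistics.mean(speedups)
geo = statistics.geometric_mean(speedups)   # n-th root of the product
har = statistics.harmonic_mean(speedups)    # n / sum of reciprocals

print(f"arithmetic mean: {arith:.4f}")
print(f"geometric mean:  {geo:.4f}")
print(f"harmonic mean:   {har:.4f}")
```

For positive values the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so the choice of average can change how large an aggregate speedup appears.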
“…The full suite or subsets thereof are also broadly recognized by relevant researchers. [23][24][25][26][27][28] These benchmarks exhibit varying characteristics, as they focus on different application scenarios: arithmetic calibration, HTML template rendering, regular expression processing, object serialization, database accessing and so forth. Some of these benchmarks are Python-dominant, primarily leveraging pure Python code to complete their computational tasks, while some are native-dominant, drawing heavily on functionalities implemented in native code provided by the interpreter or extensions.…”
Section: The Benchmark Suite
Citation type: mentioning
confidence: 99%