A Rigorous Benchmarking and Performance Analysis Methodology for Python Workloads

Crape, Arthur; Eeckhout, Lieven

doi:10.1109/iiswc50251.2020.00017

Cited by 6 publications

(7 citation statements)

References 24 publications

“…The measurement of execution time of non-idiomatic and idiomatic code is far from trivial due to the non-determinism such as Python Virtual Machine (VM) and garbage collector [7]. To overcome such non-determinism, researchers repeatedly execute the code multiple times [7], [5], [8], [26], [27], [28], [10], [9]. Since different VM invocations may result in different code execution time, and multiple executions of the code in a VM invocation may vary, we execute each code 35 iterations on 50 VM invocations as done in previous studies [7], [8], [9], [10].…”

Section: B Performance Measurement (mentioning)

confidence: 99%

“…To overcome such non-determinism, researchers repeatedly execute the code multiple times [7], [5], [8], [26], [27], [28], [10], [9]. Since different VM invocations may result in different code execution time, and multiple executions of the code in a VM invocation may vary, we execute each code 35 iterations on 50 VM invocations as done in previous studies [7], [8], [9], [10]. Since first iterations (i.e., warm-up iterations) are subject to noise caused by library loading, measurements are only collected in iterations that are subsequent to warmup.…”

Section: B Performance Measurement (mentioning)

confidence: 99%
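The measurement procedure quoted above (repeated iterations inside each of several fresh VM invocations, with warm-up iterations discarded) can be sketched roughly as follows. This is an illustrative harness, not the cited authors' tooling: the `measure` function, the `CHILD` script, and the use of `subprocess` to obtain a fresh interpreter are assumptions, and the defaults of 50 invocations, 35 iterations, and 3 warm-up iterations are simply taken from the quoted statements.

```python
import json
import subprocess
import sys

# Hypothetical child script: runs inside each fresh interpreter, times the
# statement repeatedly, and prints the timings as a JSON list.
CHILD = """
import json, sys, time
stmt, total = sys.argv[1], int(sys.argv[2])
code = compile(stmt, "<bench>", "exec")
times = []
for _ in range(total):
    start = time.perf_counter()
    exec(code, {})
    times.append(time.perf_counter() - start)
print(json.dumps(times))
"""


def measure(stmt, invocations=50, iterations=35, warmup=3):
    """Return one list of post-warm-up timings (seconds) per VM invocation."""
    results = []
    for _ in range(invocations):
        # Each subprocess is a new interpreter, i.e. a new VM invocation.
        out = subprocess.run(
            [sys.executable, "-c", CHILD, stmt, str(warmup + iterations)],
            capture_output=True, text=True, check=True,
        ).stdout
        # Drop the first `warmup` timings; keep the measured iterations.
        results.append(json.loads(out)[warmup:])
    return results
```

For example, `measure("sorted(range(1000))", invocations=5)` would return five lists of 35 timings each. Keeping the per-invocation lists separate, rather than flattening them, preserves the two levels of variation (across invocations and across iterations) that the quoted statements distinguish.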

“…To measure and compare the code execution time reliably, we execute a piece of code in 50 Virtual Machine (VM) invocations and collect the execution time of 35 execution (excluding the first 3 warm-up executions) in each VM invocation, following the process in previous studies [7], [8], [9], [10], [11]. We perform the bootstrapping with hierarchical random re-sampling and replacement on both VM invocations and execution iterations, following previous studies [12], [13], [8], [14].…”

Section: Introduction (mentioning)

confidence: 99%
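The bootstrapping with hierarchical random re-sampling mentioned in the statement above might look roughly like the sketch below. It is one plausible reading, not the cited studies' implementation: the function name `hierarchical_bootstrap`, the number of resamples, the choice of the mean as the statistic, and the percentile-style interval are all assumptions made for illustration.

```python
import random
import statistics


def hierarchical_bootstrap(samples, resamples=1000, alpha=0.05, seed=0):
    """Bootstrap the mean of two-level timing data.

    samples: list of lists, one inner list of timings per VM invocation.
    Returns (overall_mean, lower_bound, upper_bound).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(resamples):
        # Level 1: resample VM invocations with replacement.
        invocations = rng.choices(samples, k=len(samples))
        # Level 2: resample iterations within each chosen invocation.
        drawn = [t for inv in invocations for t in rng.choices(inv, k=len(inv))]
        means.append(statistics.fmean(drawn))
    means.sort()
    # Percentile interval over the bootstrap distribution of the mean.
    low = means[int((alpha / 2) * resamples)]
    high = means[int((1 - alpha / 2) * resamples) - 1]
    overall = statistics.fmean([t for inv in samples for t in inv])
    return overall, low, high
```

Resampling invocations first and iterations second keeps the nesting of the data intact, so variation between VM invocations is not understated the way it would be if all timings were pooled into one flat sample.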


Faster or Slower? Performance Mystery of Python Idioms Unveiled with Empirical Evidence

Zhang, Xing, Xia et al. 2023

Preprint


The usage of Python idioms is popular among Python developers. In a formative study of 101 Python idiom performance related questions on Stack Overflow, we find that developers often get confused about the performance impact of Python idioms and use anecdotal toy code or rely on personal project experience, which is often contradictory in performance outcomes. There has been no large-scale, systematic empirical evidence to reconcile these performance debates. In the paper, we create a large synthetic dataset with 24,126 pairs of non-idiomatic and functionally-equivalent idiomatic code for the nine unique Python idioms identified in [1], and reuse a large real-project dataset of 54,879 such code pairs provided in [1]. We develop a reliable performance measurement method to compare the speedup or slowdown by idiomatic code against non-idiomatic counterpart, and analyze the performance discrepancies between the synthetic and real-project code, the relationships between code features and performance changes, and the root causes of performance changes at the bytecode level. We summarize our findings as some actionable suggestions for using Python idioms.



“…However, researchers have not yet reached a consensus on averaging the speedups of multiple benchmarks. Some [31] endorse the geometric mean, some [24, 32] advocate the harmonic mean, and some [33] argue that the choice does not matter. Thus, we show both the geometric and harmonic mean speedup over time in Figure 10.…”

Section: The Mean Speedup Over Time (mentioning)

confidence: 99%
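To make the difference between the two averages in the statement above concrete, here is a small sketch using Python's standard `statistics` module. The speedup values are invented for illustration and do not come from any cited study; the helper name `summarize_speedups` is likewise hypothetical.

```python
import statistics


def summarize_speedups(speedups):
    """Return the geometric and harmonic mean of per-benchmark speedups."""
    return {
        "geometric": statistics.geometric_mean(speedups),
        "harmonic": statistics.harmonic_mean(speedups),
    }


# Hypothetical data: one benchmark runs 2x faster, another 2x slower.
summary = summarize_speedups([2.0, 0.5])
print(summary)  # geometric is about 1.0, harmonic is about 0.8
```

With these made-up inputs the geometric mean treats a 2x speedup and a 2x slowdown as cancelling out, while the harmonic mean weights the slowdown more heavily. That sensitivity is one reason reporting both, as the quoted statement does, can be informative.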

“…The full suite or subsets thereof are also broadly recognized by relevant researchers. [23][24][25][26][27][28] These benchmarks exhibit varying characteristics, as they focus on different application scenarios: arithmetic calibration, HTML template rendering, regular expression processing, object serialization, database accessing and so forth. Some of these benchmarks are Python-dominant, primarily leveraging pure Python code to complete their computational tasks, while some are native-dominant, drawing heavily on functionalities implemented in native code provided by the interpreter or extensions.…”

Section: The Benchmark Suite (mentioning)

confidence: 99%

Python meets JIT compilers: A simple implementation and a comparative evaluation

Zhang, Xu 2023

Softw Pract Exp


Developing a just‐in‐time (JIT) compiler can be a daunting task, especially for a language as flexible as Python. While PyPy, powered with JIT compilation, can often outperform the official pure interpreter, CPython, by a noteworthy margin, its popularity remains far from comparable to that of CPython due to some issues. Given that an easier‐to‐deploy and better‐compatible JIT compiler would benefit more Python users, we have developed comPyler, a simple JIT compiler functioning as a CPython extension and intended to convert frequently interpreted CPython bytecode into equivalent machine code. Designed with good compatibility in mind, it does not alter CPython's internal data structures or external interfaces. Based on LLVM's mature infrastructure, it can be readily ported to almost all platforms. Compared with CPython, it achieved the highest speedup of 2.205, with an average of 1.093. Despite its relatively limited effect, comPyler incurs low development costs. As a baseline compiler, it also sheds light on the improvement attainable by optimizing solely the overhead of bytecode interpretation. Furthermore, as there is still a dearth of empirical research covering the multitude of JIT compilers available for Python, we have conducted a performance study that examines Jython, IronPython, PyPy, GraalPy, Pyston, Pyjion, and our comPyler. Our research takes into account not only the benchmark speed for various time windows but also the boot latency and memory footprint. Through this comprehensive study, our objective is to assist developers in gaining a better understanding of the effects of distinct JIT compilation techniques and to aid users in making informed decisions when choosing among different Python implementations.


Evaluating Performance and Resource Consumption of REST Frameworks and Execution Environments: Insights and Guidelines for Developers and Companies

Meglio, Libero Lucio Starace 2024

IEEE Access

