Deep Neural Networks (DNNs) have transformed the automation of a wide range of industries and are increasingly ubiquitous in society. The high complexity of DNN models and their widespread adoption have driven the energy consumption of deep learning to double every 3-4 months. Current energy consumption measures largely monitor system-wide consumption or make linear assumptions about DNN models. The former approach captures unrelated energy consumption anomalies, whilst the latter does not accurately reflect nonlinear computations. In this paper, we are the first to develop a bottom-up Transistor Operations (TOs) approach that exposes the role of nonlinear activation functions and neural network structure. As energy measurement at the core level inevitably incurs errors, we statistically model the energy scaling laws rather than absolute consumption values. We offer models for both feedforward DNNs and convolutional neural networks (CNNs) on a variety of data sets and hardware configurations, achieving 93.6%-99.5% precision. This outperforms existing FLOPs-based methods, and our TOs method can be further extended to other DNN models.

Impact Statement: Deep learning is one of the fastest-growing consumers of computational resources (a 300,000x increase from 2012 to 2018, doubling every 3-4 months). Data centres are predicted to account for over 20% of global energy consumption by 2030. Our proposed TOs model provides developers with a theoretical model that exposes the important role of both (1) nonlinear activation functions and (2) DNN model structure in energy consumption. This enables developers to trade off model performance against sustainability with 93.6%-99.5% precision. Because TOs account for both linear and nonlinear operations, the metric can to some extent replace FLOPs/MACs counts as a more accurate measure of DNN model complexity.
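
To make the TOs idea concrete, the following is a minimal, hypothetical Python sketch of how one might count transistor operations for a feedforward network and then fit a linear energy scaling law to measured energies. The per-operation TO costs, layer sizes, and energy readings are illustrative assumptions for exposition only; they are not values or code from the paper.

```python
import numpy as np

# Hypothetical per-operation transistor-operation (TO) costs.
# These constants are illustrative placeholders, not values from the paper.
TO_PER_MAC = 1.0       # multiply-accumulate (linear operation)
TO_PER_RELU = 0.05     # cheap piecewise-linear activation
TO_PER_SIGMOID = 0.8   # more expensive nonlinear activation

def layer_transistor_ops(n_in, n_out, activation="relu"):
    """Rough TO count for one fully connected layer: linear MACs plus
    the nonlinear activation applied to each output neuron."""
    linear_ops = n_in * n_out * TO_PER_MAC
    act_cost = {"relu": TO_PER_RELU, "sigmoid": TO_PER_SIGMOID}[activation]
    return linear_ops + n_out * act_cost

def network_transistor_ops(layer_sizes, activation="relu"):
    """Sum TOs over all layers of a feedforward DNN."""
    return sum(
        layer_transistor_ops(n_in, n_out, activation)
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )

# Fit a scaling law (energy ~ a * TOs + b) rather than predicting absolute
# energy, mirroring the statistical-scaling approach described above.
# The energy measurements below are fabricated purely to show the fitting step.
tos = np.array([network_transistor_ops([784, h, 10]) for h in (64, 128, 256, 512)])
energy_joules = np.array([0.11, 0.19, 0.37, 0.71])  # illustrative measurements
a, b = np.polyfit(tos, energy_joules, 1)
print(f"energy ~= {a:.3e} * TOs + {b:.3e}")
```

In this sketch, choosing different activation costs (e.g. sigmoid versus ReLU) changes the TO count and hence the predicted scaling, which is the kind of nonlinearity-aware distinction that a pure FLOPs/MACs count would miss.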