Ranking Loss: Maximizing the Success Rate in Deep Learning Side-Channel Analysis

Zaid, Gabriel; Bossuet, Lilian; Dassance, François; Habrard, Amaury

doi:10.46586/tches.v2021.i1.25-55

Cited by 27 publications

(24 citation statements)

References 25 publications


“…Compared to the interpretation of deep learning based side-channel analysis as a classification problem that is more popular in the literature on deep learning based side-channel analysis (see e.g. [24,26] and the references cited therein), this has the advantage that the values we are trying to predict are in fact not categorical labels, but numbers. By using a network that is a numerical regressor instead of a classifier, we can use that information to judge the severity of prediction errors, to alleviate the problem of rare output classes, and to induce an inductive bias towards mapping similar power traces to nearby Hamming weight estimates.…”

Section: Leakage Model (mentioning)

confidence: 99%

Breaking Masked Implementations of the Clyde-Cipher by Means of Side-Channel Analysis

Gohr, Laus, Schindler. 2022. TCHES


In this paper we present our solution to the CHES Challenge 2020, the task of which it was to break masked hardware respective software implementations of the lightweight cipher Clyde by means of side-channel analysis. We target the secret cipher state after processing of the first S-box layer. Using the provided trace data we obtain a strongly biased posterior distribution for the secret-shared cipher state at the targeted point; this enables us to see exploitable biases even before the secret sharing based masking. These biases on the unshared state can be evaluated one S-box at a time and combined across traces, which enables us to recover likely key hypotheses S-box by S-box. In order to see the shared cipher state, we employ a deep neural network similar to the one used by Gohr, Jacob and Schindler to solve the CHES 2018 AES challenge. We modify their architecture to predict the exact bit sequence of the secret-shared cipher state. We find that convergence of training on this task is unsatisfying with the standard encoding of the shared cipher state and therefore introduce a different encoding of the prediction target, which we call the scattershot encoding. In order to further investigate how exactly the scattershot encoding helps to solve the task at hand, we construct a simple synthetic task where convergence problems very similar to those we observed in our side-channel task appear with the naive target data encoding but disappear with the scattershot encoding. We complete our analysis by showing results that we obtained with a “classical” method (as opposed to an AI-based method), namely the stochastic approach, that we generalize for this purpose first to the setting of shared keys. We show that the neural network draws on a much broader set of features, which may partially explain why the neural-network based approach massively outperforms the stochastic approach. On the other hand, the stochastic approach provides insights into properties of the implementation, in particular the observation that the S-boxes behave very different regarding the easiness respective hardness of their prediction.



“…They further adapted the CER metric to a new loss function to adopt in the training process, which can yield enhanced attack performance when dealing with imbalanced data. Another innovative loss function, Ranking Loss (RkL), was proposed by adapting the "Learning to Rank" approach in the Information Retrieval field to the side-channel context [17], which helps prevent approximation and estimation errors, induced by the typical cross-entropy loss.…”

Section: A Related Work (mentioning)

confidence: 99%
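The "Learning to Rank" idea mentioned in the statement above can be illustrated with a pairwise logistic loss: the score of the correct key hypothesis is compared against the score of every other hypothesis, and each pair where the correct key does not win by a clear margin is penalized. The following is a minimal sketch under that assumption, not necessarily the exact formulation of the cited paper; the function name `pairwise_ranking_loss` and the sharpness parameter `alpha` are hypothetical.

```python
import numpy as np

def pairwise_ranking_loss(scores, true_key, alpha=1.0):
    """Pairwise logistic ranking loss for one attack (illustrative sketch).

    scores   : 1-D array with one accumulated score per key hypothesis
    true_key : index of the correct key hypothesis
    alpha    : hypothetical sharpness parameter of the logistic penalty
    """
    scores = np.asarray(scores, dtype=float)
    # Margin between the correct key's score and every hypothesis.
    margins = scores[true_key] - scores
    # Drop the comparison of the correct key with itself.
    margins = np.delete(margins, true_key)
    # Sum of log2(1 + exp(-alpha * margin)), computed stably via logaddexp.
    return float(np.sum(np.logaddexp(0.0, -alpha * margins)) / np.log(2.0))

# The loss is small when the correct key (index 0) is ranked first
# and large when it is ranked last.
well_ranked = pairwise_ranking_loss([5.0, 0.0, 0.0], true_key=0)
badly_ranked = pairwise_ranking_loss([0.0, 5.0, 5.0], true_key=0)
```

Unlike cross-entropy, a loss of this shape depends only on score differences between hypotheses, which is what ties it to the rank of the correct key.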

“…MLPs joint the features in the dense layers and accordingly imitate the effect of higher-order attacks, hence standing as a natural choice when dealing with masking countermeasures [11], [12]. On the other hand, CNNs have the property of spatial invariance, bringing about intrinsic resilience against trace misalignment [13]-[17]. Furthermore, the internal architecture of CNNs make them adequate for automatical feature extraction from high-dimensional data, which eliminates the demand for preprocessing.…”

Section: Introduction (mentioning)

confidence: 99%

Towards Strengthening Deep Learning-based Side Channel Attacks with Mixup

Luo, Zheng, Wang, et al. 2021. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)


In recent years, various deep learning techniques have been exploited in side channel attacks, with the anticipation of obtaining more appreciable attack results. Most of them concentrate on improving network architectures or putting forward novel algorithms, assuming that there are adequate profiling traces available to train an appropriate neural network. However, in practical scenarios, profiling traces are probably insufficient, which makes the network learn deficiently and compromises attack performance. In this paper, we investigate a kind of data augmentation technique, called mixup, and first propose to exploit it in deep learning based side channel attacks, for the purpose of expanding the profiling set and facilitating the chances of mounting a successful attack. We perform Correlation Power Analysis for generated traces and original traces, and discover that there exists consistency between them regarding leakage information. Our experiments show that mixup is truly capable of enhancing attack performance especially for insufficient profiling traces. Specifically, when the size of the training set is decreased to 30% of the original set, mixup can significantly reduce acquired attacking traces. We test three mixup parameter values and conclude that generally all of them can bring about improvements. Besides, we compare three leakage models and unexpectedly find that least significant bit model, which is less frequently used in previous works, actually surpasses prevalent identity model and hamming weight model in terms of attack results.
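For readers unfamiliar with mixup, the augmentation described in this abstract can be sketched as follows: each synthetic profiling trace is a convex combination of two real traces, and its label is the same combination of their one-hot labels. This is a generic sketch of the standard mixup recipe, not the authors' code; the function name `mixup`, the array shapes, and the default `alpha=0.5` are assumptions chosen for illustration.

```python
import numpy as np

def mixup(traces, labels_onehot, alpha=0.5, rng=None):
    """Generate one mixed sample per input trace (illustrative sketch).

    traces        : array of shape (n_traces, n_samples)
    labels_onehot : array of shape (n_traces, n_classes)
    alpha         : Beta(alpha, alpha) parameter; 0.5 is an arbitrary choice
    """
    rng = np.random.default_rng() if rng is None else rng
    traces = np.asarray(traces, dtype=float)
    labels_onehot = np.asarray(labels_onehot, dtype=float)
    n = traces.shape[0]
    # One mixing coefficient per generated sample.
    lam = rng.beta(alpha, alpha, size=n)[:, None]
    # Pair each trace with a randomly chosen partner.
    perm = rng.permutation(n)
    mixed_traces = lam * traces + (1.0 - lam) * traces[perm]
    mixed_labels = lam * labels_onehot + (1.0 - lam) * labels_onehot[perm]
    return mixed_traces, mixed_labels

# Toy usage: 8 traces of 100 samples each, 4 classes.
rng = np.random.default_rng(0)
toy_traces = rng.normal(size=(8, 100))
toy_labels = np.eye(4)[rng.integers(0, 4, size=8)]
aug_traces, aug_labels = mixup(toy_traces, toy_labels, rng=rng)
```

The mixed samples would typically be appended to the original profiling set before training.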


“…On the other hand, launching practical attacks and averaging the key rank to estimate the guessing entropy is computationally costly (especially if also done during the training phase, see, e.g., [14,19]). Consequently, the SCA community put a significant attention on developing SCA-specific metrics and loss functions, such as Cross-Entropy Ratio loss (CER) [30] and ranking loss (RKL) [28].…”

Section: Introduction (mentioning)

confidence: 99%
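The cost mentioned in the statement above comes from repeating the attack many times: guessing entropy is commonly estimated as the average rank of the correct key over several attacks, each run on a random subset of attack traces. The sketch below assumes the model outputs a log-probability per key hypothesis for each trace; the function names, the array layout, and the default of 100 repetitions are hypothetical.

```python
import numpy as np

def key_rank(log_probs, true_key):
    """Rank of the true key (0 = best) after summing per-trace log-probabilities."""
    total = np.sum(log_probs, axis=0)
    # Count the hypotheses that score strictly higher than the true key.
    return int(np.sum(total > total[true_key]))

def guessing_entropy(log_probs, true_key, n_attack, n_repeats=100, rng=None):
    """Average key rank over repeated attacks on random trace subsets.

    log_probs : array of shape (n_traces, n_keys), one row per attack trace
    n_attack  : number of traces used in each simulated attack
    """
    rng = np.random.default_rng() if rng is None else rng
    log_probs = np.asarray(log_probs, dtype=float)
    ranks = []
    for _ in range(n_repeats):
        # Draw a fresh subset of attack traces without replacement.
        idx = rng.choice(log_probs.shape[0], size=n_attack, replace=False)
        ranks.append(key_rank(log_probs[idx], true_key))
    return float(np.mean(ranks))

# Toy usage: key 2 always receives the highest log-probability.
toy = np.full((50, 4), -2.0)
toy[:, 2] = -0.5
ge_best = guessing_entropy(toy, true_key=2, n_attack=10, n_repeats=20,
                           rng=np.random.default_rng(0))
```

Each evaluation requires `n_repeats` full attacks, which is why running this metric inside the training loop is expensive.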

Focus is Key to Success: A Focal Loss Function for Deep Learning-Based Side-Channel Analysis

Kerkhof, Perin, et al. 2022. Lecture Notes in Computer Science


The deep learning-based side-channel analysis represents one of the most powerful side-channel attack approaches. Thanks to its capability in dealing with raw features and countermeasures, it becomes the de facto standard approach for the SCA community. The recent works significantly improved the deep learning-based attacks from various perspectives, like hyperparameter tuning, design guidelines, or custom neural network architecture elements. Still, insufficient attention has been given to the core of the learning process - the loss function. This paper analyzes the limitations of the existing loss functions and then proposes a novel side-channel analysis-optimized loss function: Focal Loss Ratio (FLR), to cope with the identified drawbacks observed in other loss functions. To validate our design, we 1) conduct a thorough experimental study considering various scenarios (datasets, leakage models, neural network architectures) and 2) compare with other loss functions used in the deep learning-based side-channel analysis (both "traditional" ones and those designed for side-channel analysis). Our results show that FLR loss outperforms other loss functions in various conditions while not having computational overhead like some recent loss function proposals.

