2020
DOI: 10.46586/tches.v2021.i1.25-55
Ranking Loss: Maximizing the Success Rate in Deep Learning Side-Channel Analysis

Abstract: The side-channel community recently investigated a new approach, based on deep learning, to significantly improve profiled attacks against embedded systems. Compared to template attacks, deep learning techniques can deal with protected implementations, such as masking or desynchronization, without substantial preprocessing. However, important issues are still open. One challenging problem is to adapt the methods classically used in the machine learning field (e.g. loss function, performance metrics) to the spe…
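The title's target metric is the success rate of the resulting attack. As a point of reference only (not part of the indexed record), the following minimal numpy sketch shows how a first-order success rate is typically estimated from a model's per-trace key-hypothesis scores; the array names, the `n_attack` parameter, and the `success_rate` helper are illustrative, not taken from the paper.

```python
import numpy as np

def success_rate(log_probs, true_key, n_attack=50, n_experiments=100, rng=None):
    """First-order success rate: fraction of repeated attacks in which the
    correct key is ranked first after accumulating per-trace scores.

    log_probs: array of shape (n_traces, n_keys) holding log P(k | trace_i)
               as produced by the profiling model (illustrative name).
    true_key:  index of the correct key byte.
    n_attack:  number of attack traces used in each repeated experiment.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_traces = log_probs.shape[0]
    hits = 0
    for _ in range(n_experiments):
        idx = rng.choice(n_traces, size=n_attack, replace=False)  # random attack subset
        scores = log_probs[idx].sum(axis=0)                       # log-likelihood per key hypothesis
        hits += int(np.argmax(scores) == true_key)                # success iff correct key ranks first
    return hits / n_experiments
```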

Cited by 27 publications (24 citation statements) · References 25 publications
“…Compared to the interpretation of deep learning based side-channel analysis as a classification problem, which is more popular in the literature (see e.g. [24,26] and the references cited therein), this has the advantage that the values we are trying to predict are in fact not categorical labels but numbers. By using a network that is a numerical regressor instead of a classifier, we can use that information to judge the severity of prediction errors, to alleviate the problem of rare output classes, and to induce an inductive bias towards mapping similar power traces to nearby Hamming weight estimates.…”
Section: Leakage Model (mentioning)
confidence: 99%
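The contrast drawn in this statement, a numerical regressor over Hamming weights versus a categorical classifier, is easy to make concrete. The PyTorch sketch below only illustrates that contrast under assumed layer sizes; it is not the cited authors' architecture.

```python
import torch
import torch.nn as nn

# Hypothetical trace length; both heads share the same feature extractor.
N_SAMPLES = 700

features = nn.Sequential(
    nn.Linear(N_SAMPLES, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
)

# Classification view: 9 categorical labels (the Hamming weight of a byte is
# 0..8), trained with cross-entropy, which ignores how far a wrong class is.
clf_head = nn.Linear(64, 9)
clf_loss = nn.CrossEntropyLoss()

# Regression view: a single numerical output, trained with MSE, so predicting
# HW 3 for a true HW of 4 is penalized far less than predicting 8.
reg_head = nn.Linear(64, 1)
reg_loss = nn.MSELoss()

traces = torch.randn(32, N_SAMPLES)   # dummy batch of power traces
hw = torch.randint(0, 9, (32,))       # true Hamming weights

z = features(traces)
loss_clf = clf_loss(clf_head(z), hw)                     # class indices as targets
loss_reg = reg_loss(reg_head(z).squeeze(1), hw.float())  # numeric targets
```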
“…They further adapted the CER metric into a new loss function to be used during training, which can yield enhanced attack performance when dealing with imbalanced data. Another innovative loss function, Ranking Loss (RkL), was proposed by adapting the "Learning to Rank" approach from the Information Retrieval field to the side-channel context [17]; it helps prevent the approximation and estimation errors induced by the typical cross-entropy loss.…”
Section: A Related Work (mentioning)
confidence: 99%
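For readers unfamiliar with "Learning to Rank", the sketch below gives a hedged PyTorch approximation of a pairwise ranking penalty of the kind RkL [17] builds on: the accumulated score of the correct key is pushed above every wrong hypothesis via a logistic (softplus) penalty. The exact formulation in [17] differs (scores are built per key hypothesis from the plaintexts and a leakage model, which is omitted here); `alpha` and the key-indexed outputs are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def ranking_loss(logits, correct_key, alpha=0.5):
    """Sketch of a pairwise 'learning to rank' penalty in the spirit of RkL:
    accumulate one score per key hypothesis over the batch and penalize every
    wrong key that scores close to, or above, the correct one.

    logits:      (batch, n_keys) model outputs, assumed key-indexed here.
    correct_key: index of the key that should be ranked first.
    alpha:       scaling of the pairwise logistic penalty (assumed name).
    """
    # Attack-style accumulation: one score per key hypothesis over the batch.
    scores = torch.log_softmax(logits, dim=1).sum(dim=0)   # (n_keys,)
    s_star = scores[correct_key]
    mask = torch.ones_like(scores, dtype=torch.bool)
    mask[correct_key] = False
    # softplus(-alpha * margin) = log(1 + exp(-alpha * (s* - s_k)))
    return F.softplus(-alpha * (s_star - scores[mask])).sum()

# Usage on dummy outputs for a single key byte (256 hypotheses):
out = torch.randn(64, 256, requires_grad=True)
loss = ranking_loss(out, correct_key=137)
loss.backward()
```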
“…MLPs join the features in the dense layers and accordingly imitate the effect of higher-order attacks, hence standing as a natural choice when dealing with masking countermeasures [11], [12]. On the other hand, CNNs have the property of spatial invariance, bringing intrinsic resilience against trace misalignment [13]-[17]. Furthermore, the internal architecture of CNNs makes them well suited to automatic feature extraction from high-dimensional data, which eliminates the need for preprocessing.…”
Section: Introduction (mentioning)
confidence: 99%
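To make the architectural contrast in this statement tangible, here is a minimal PyTorch sketch of the two families: an MLP whose dense layers freely combine distant time samples, and a 1-D CNN whose shared filters and pooling give some tolerance to misalignment. All sizes (trace length, layer widths, kernel sizes) are assumptions for illustration, not taken from the cited works.

```python
import torch
import torch.nn as nn

N_SAMPLES, N_CLASSES = 700, 256   # assumed trace length and label space

mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(N_SAMPLES, 200), nn.ReLU(),   # dense layers combine distant samples,
    nn.Linear(200, 200), nn.ReLU(),         # imitating a higher-order combination
    nn.Linear(200, N_CLASSES),
)

cnn = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=11, padding=5), nn.ReLU(),
    nn.AvgPool1d(2),                        # pooling adds shift invariance
    nn.Conv1d(8, 16, kernel_size=11, padding=5), nn.ReLU(),
    nn.AvgPool1d(2),
    nn.Flatten(),
    nn.Linear(16 * (N_SAMPLES // 4), N_CLASSES),
)

traces = torch.randn(32, 1, N_SAMPLES)      # (batch, channel, time)
assert mlp(traces).shape == cnn(traces).shape == (32, N_CLASSES)
```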
“…On the other hand, launching practical attacks and averaging the key rank to estimate the guessing entropy is computationally costly (especially if also done during the training phase, see, e.g., [14,19]). Consequently, the SCA community has paid significant attention to developing SCA-specific metrics and loss functions, such as the Cross-Entropy Ratio loss (CER) [30] and the Ranking Loss (RkL) [28].…”
Section: Introduction (mentioning)
confidence: 99%
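The "computationally costly" empirical estimate mentioned in this statement is usually obtained by repeating the attack on shuffled traces and averaging the rank of the correct key. A minimal numpy sketch with assumed array names is given below; it is an illustration of the standard procedure, not code from the cited papers.

```python
import numpy as np

def guessing_entropy(log_probs, true_key, n_experiments=100, rng=None):
    """Empirical guessing entropy: average rank of the correct key as a
    function of the number of attack traces, over repeated shuffled attacks.

    log_probs: (n_traces, n_keys) per-trace log P(k | trace) from the model
               (illustrative names).
    Returns ge with ge[t] = mean rank of the true key after t + 1 traces.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_traces, _ = log_probs.shape
    ranks = np.empty((n_experiments, n_traces))
    for e in range(n_experiments):
        cum = np.cumsum(log_probs[rng.permutation(n_traces)], axis=0)  # running key scores
        # rank 0 = best: count hypotheses scoring at least as well as the true key
        ranks[e] = (cum >= cum[:, [true_key]]).sum(axis=1) - 1
    return ranks.mean(axis=0)
```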