Substitution boxes (S-boxes) are essential components of many cryptographic primitives. Dijkstra's algorithm, SAT solvers, and heuristic methods have been used to find bitsliced implementations of S-boxes. However, these methods are difficult to apply to 8-bit S-boxes because of their size. Therefore, to implement such S-boxes so that side-channel countermeasures can be applied efficiently, designers have widely used structures such as Feistel, Lai-Massey, and MISTY, which admit bitsliced implementations with a small number of nonlinear operations. Because S-boxes built from these structures are composed of smaller S-boxes and follow specific designs, their cryptographic security and efficiency are limited. In this paper, we propose a new method, different from existing ones, that generates S-boxes by stacking bitwise operations starting from the identity function. This method can be expressed as a Markov decision process, for which reinforcement learning is a suitable solver. Our goal is to train an agent through reinforcement learning to generate S-boxes to which masking, a countermeasure against side-channel attacks, can be applied efficiently. In particular, our method yields various S-boxes that are superior or comparable to existing S-boxes. For the first time, we produce 8-bit S-boxes with differential uniformity 16 (resp. 32) and linearity 128 (resp. 128) using nine (resp. eight) nonlinear operations. To the best of our knowledge, this is the first study to construct cryptographic S-boxes using reinforcement learning.
INDEX TERMS S-box, Masking efficiency, Reinforcement learning, Bitsliced implementation, Linearity, Differential uniformity

I. INTRODUCTION

To deploy security applications on mobile and embedded platforms, lightweight and efficient cryptographic primitives are required. Substitution boxes (S-boxes) are representative nonlinear functions that give cryptographic primitives Shannon's confusion property [1]. Finding S-boxes with sufficient security and efficiency is therefore an important problem for designers of cryptographic algorithms. Side-channel attacks, first published by Kocher in 1996 [2], can extract secret information from cryptographic algorithms through side-channel leakage such as electromagnetic emissions and power consumption. Since mathematical cryptanalysis alone cannot guarantee the security of cryptographic primitives against side-channel attacks, various countermeasures have been proposed. Techniques that randomize the intermediate values of ciphers are widely used, and among them, higher-order Boolean masking is the most popular approach.
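To make the cost argument behind Boolean masking concrete, the following is a minimal sketch, not the construction used in this paper. It assumes an ISW-style AND gadget as the nonlinear building block; the function names (`share`, `unshare`, `masked_xor`, `masked_and`) and the 8-bit word width are illustrative choices. It shows that a linear operation is applied share by share at no extra randomness cost, whereas a nonlinear operation needs a number of partial products and fresh random values that grows quadratically with the number of shares.

```python
import secrets


def share(x, n_shares, width=8):
    """Split x into n_shares Boolean shares whose XOR equals x."""
    shares = [secrets.randbits(width) for _ in range(n_shares - 1)]
    last = x & ((1 << width) - 1)
    for s in shares:
        last ^= s
    return shares + [last]


def unshare(shares):
    """Recombine shares by XOR (only for checking; never done on a device)."""
    acc = 0
    for s in shares:
        acc ^= s
    return acc


def masked_xor(a, b):
    """Linear operation: processed share by share, no fresh randomness."""
    return [ai ^ bi for ai, bi in zip(a, b)]


def masked_and(a, b, width=8):
    """ISW-style AND gadget (assumed here for illustration).

    Uses n*n partial products and n*(n-1)/2 fresh random words,
    so its cost grows quadratically with the number of shares n.
    """
    n = len(a)
    c = [a[i] & b[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(width)
            c[i] ^= r
            # Bracketing keeps the random word combined first.
            c[j] ^= (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
    return c
```

Under this sketch, an S-box circuit with fewer AND/OR gates calls `masked_and` fewer times, which is why the number of nonlinear operations is the natural measure of masking efficiency.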
Implementing an S-box as a table look-up (TLU) requires table storage space, and especially when applying the masking scheme, even more flip-flops are required.
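The contrast between a table look-up and a bitsliced evaluation can be sketched as follows. The 4-bit S-box `toy_sbox_bits` is a hypothetical example invented for illustration, not an S-box from this paper; it is built by applying a few bitwise updates to the identity, each of which XORs one bit with a function of the other bits and is therefore invertible. The helpers `pack` and `unpack` are likewise illustrative names.

```python
def toy_sbox_bits(x0, x1, x2, x3):
    """Hypothetical 4-bit S-box: bitwise operations stacked on the identity."""
    x0 ^= x1 & x2   # nonlinear operation 1
    x3 ^= x0 | x1   # nonlinear operation 2
    x1 ^= x3        # linear operation
    x2 ^= x0 & x3   # nonlinear operation 3
    return x0, x1, x2, x3


def pack(values, n_bits=4):
    """Bitslice: word i holds bit i of every input, one input per bit position."""
    words = [0] * n_bits
    for k, v in enumerate(values):
        for i in range(n_bits):
            words[i] |= ((v >> i) & 1) << k
    return words


def unpack(words, count):
    """Inverse of pack: rebuild the individual values from the bit planes."""
    return [
        sum(((w >> k) & 1) << i for i, w in enumerate(words))
        for k in range(count)
    ]


# Table look-up form: 2**4 stored entries, one per possible input.
TABLE = [unpack(toy_sbox_bits(*pack([v])), 1)[0] for v in range(16)]

# Bitsliced form: the same bitwise operations process all inputs in parallel.
inputs = [3, 14, 7, 0, 9, 9, 12]
bitsliced = unpack(toy_sbox_bits(*pack(inputs)), len(inputs))
```

In this sketch the bitsliced form stores no table at all; its cost is the sequence of bitwise operations, of which only the three nonlinear ones would need a masked gadget.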