“Solving Circuit Optimisation Problems in Cryptography and Cryptanalysis”

Recently, Nicolas Courtois sent me a revised version of my PRESENT bitslice implementation which improves the representation of the S-Box and hence the performance considerably. A paper describing the techniques used to arrive at this new S-box representation is now available on eprint:

Solving Circuit Optimisation Problems in Cryptography and Cryptanalysis

Nicolas T. Courtois, Daniel Hulme and Theodosis Mourouzis

Abstract: One of the hardest problems in computer science is the problem of gate-eficient implementation. Such optimizations are particularly important in industrial hardware implementations of standard cryptographic algorithms. In this paper we focus on optimizing some small circuits such as S-boxes in cryptographic algorithms. We consider the notion of Multiplicative Complexity, a new important notion of complexity introduced in 2008 by Boyar and Peralta and applied to find interesting optimizations for the S-box of the AES cipher. We applied this methodology to produce a compact implementation of several ciphers. In this short paper we report our results on PRESENT and GOST, two block ciphers known for their exceptionally low hardware cost. This kind of representation seems to be very promising in implementations aiming at preventing side channel attacks on cryptographic chips such as DPA. More importantly, we postulate that this kind of minimality is also an important and interesting tool in cryptanalysis.

Final Version of my PhD thesis

I passed my PhD defence on Friday. I’ve uploaded the final version of my thesis titled Algorithmic Algebraic Techniques and their Application to Block Cipher Cryptanalysis. Not much changed since the last draft, which I also posted on this blog, except a few typos and errors.

Continue reading “Final Version of my PhD thesis”

A Simple Bitslice Implementation of PRESENT

I had a hunch the other day which required $2^{36}$ plaintext—ciphertext pairs to be tested. While the hunch turned out to be wrong, at least I got a simple
bitslice implementation of PRESENT (in pure C99) out of it. The implementation is far from optimal:

  • it is pure C but critical paths should be pushed down to assembly,
  • it doesn’t use SSE2’s 128-bit instructions but it should, and
  • the S-Box is hardly optimised (it is simply ANF).

Still, it is better than what I found on the web via a quick Google search. On my 2.33 Ghz Core2Duo Macbook Pro it seems to run at 28 cycles per byte for long messages (not counting the key schedule). For comparison, I think the current speed of AES in bit slice mode is like 10 cycles per byte on the Core2. If I need some performance in the future, I might sit down and improve the implementation (such that it doesn’t totally suck) but for now I have to attend to other projects.