OpenPGP

One of the nice aspects of my current occupation is that I can type OpenPGP into springerlink’s and Google Scholar’s search boxes and claim that reading every paper I deem interesting is “work”. OpenPGP is the standard which is implemented by programs like GnuPG and PGP for e-mail encryption and digital signatures. The reason I became curious is because I wanted to implement something like an OpenPGP encrypted wiki or filesystem for multiple users.

  • RFC 4880 is the current revision of the OpenPGP message format standard, addressing some of the security concerns mentioned below. It replaces RFC 1991 and RFC 2440. You have to admire that they got their hands on 4880 to replace 2440.
  • Can We Trust Cryptographic Software? Cryptographic Flaws in GNU Privacy Guard v1.2.3 by Phong Nguyen describes a few errors made in the GnuPG implementation of the OpenPGP standard. In particular some parameters used in ElGamal encryption were chosen to be small for performance reasons which allowed lattice-based attacks. Lattice-based attacks on small parameters in public-key cryptography are not new, another example is textbook RSA with say 512-bit modulus encrypting a DES 56-bit key. From the paper: “If a proprietary software claims to implement 2048-bit RSA and 128-bit AES, it does not say much about the actual cryptographic security: which RSA is being used? Could it be textbook RSA (with zero-padding) encrypting a 128-bit AES key with public exponent 3? … Open source software thus sounds like a good solution. However, the fact that a source code can be read does not necessarily imply that it is actually read, especially by cryptography experts.” The flaw was rather serious (one package was sufficient to compute the private key) but the required configuration fortunately not very wide-spread since it was never the default choice. The particular option was removed from GnuPG since then.
  • An Attack on CFB Mode Encryption as Used by OpenPGP by Serge Mister and Robert Zuccherato describes an attack on the ad-hoc modification of CFB mode. PGP does not use variable IVs but instead encrypts a random block first and then two bytes which repeat two bytes from the first block. This redundancy provides a “quick check” whether the correct symmetric key was used for decryption or not. This also instantiates an integrity-check oracle if the information whether decryption passed this test or not is made available to the attacker. She can use this oracle to decrypt two bytes from any ciphertext block. The setup costs $2^{15}$ oracle queries and each block also costs $2^{15}$ oracle queries on average. RFC4880 discourages the use of this “quick check” and I think GnuPG avoids it.
  • Adaptive-CCA on OpenPGP Revisited by Hsi-Chung Lin, Sung-Ming Yen and Guan-Ting Chen does what the title implies. It revisits older adaptive CCA attacks and evaluates their applicability to RFC2440, also some new adaptive CCA attacks with weaker assumptions are proposed. All these attacks should not apply against RFC4880 anymore.
  • Privacy in Encrypted Content Distribution Using Private Broadcast Encryption by Adam Barth, Dan Boneh and Brent Waters is not really about OpenPGP. The authors construct a system where content is distributed in encrypted form but no one can tell who is a recipient not even other recipients: private broadcast encryption. OpenPGP does not provide this feature, as pointed out in the paper. While it allows to remove the explicit tag for which key a packet is encrypted, it chooses random Diffie-Hellmann groups for each key and thus still allows to break privacy (by distinguishing groups). While this could easily be fixed too, the authors also consider active attacks where an attacker modifies an encrypted message for Alice to contain the text “please visit the following URL for free music” (that really is their example!). The attacker then waits for Alice to click on the link which can only happen if she could decrypt the original message.

The bottom line is that OpenPGP features some ad-hoc cryptography which is not up to the standards of the cryptography research community. For example, OpenPGP is most definitely not secure against chosen-ciphertext attacks (CCA). This is likely not an issue for e-mail security where a human being enters passphrases to unlock private keys and where reports of errors are not relayed to a potential attacker. However, for instance a server which automatically decrypts messages and acts based on the content of the cleartext is a whole different story … and so is my OpenPGP encrypted wiki thingy.

PS: I will be at the ECRYPT-II Workshop on Cryptology: Progress and Challenges in Leuven in two weeks.

Advertisements

SAT Solving Pointers

This is just a quick note to point out two SAT-solving sources relevant for cryptography.

Have fun.

Algebraic Attacks and CNF

Since the seminal papers [1] and [2] by Bard, Courtois and Jefferson it seems accepted wisdom that the right thing to do for constructing a CNF representation of a block cipher is to construct an algebraic system of equations first (cf. [3]). This system of equations is then converted to CNF using some ANF to CNF converted (e.g. [4]) which deals with the negative impact of the XORs just introduced via the ANF. On the other hand, it is straight forward to compute some CNF for a given S-Box directly by considering its truth table. Sage now contains code which does this for you:

sage: sr = mq.SR(1,1,1,4,gf2=True,polybori=True)
sage: S = sr.sbox()
sage: print S.cnf()

[(1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7),(1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3,4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1,2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4, 6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8), (1, 2, 3, 4, 5), (1, 2, 3, 4,6), (1, 2, 3, 4, 7), (1, 2, 3, 4, 8)]

I am not claiming that this naive approach produces an optimal representation, it seems more compact than what ANF to CNF converters produce, though.

Geometric XL

The paper by Sean Murphy and Maura Paterson has been around for quite some time now (same for my toy implementation). Gröbner basis algorithms and related methods like XL are algebraic in nature. In particular, their complexity is not invariant under a linear change of coordinates. For example consider Cyclic-6

    sage: P.<a,b,c,d,e,f,h> = PolynomialRing(GF(32003))
    sage: I = sage.rings.ideal.Cyclic(P,6).homogenize(h)
    sage: J = Ideal(I.groebner_basis())

The generators of J are a Gröbner basis and we can use this property to find a common root for these generators. Now, consider the same equations but permute the variables in the ring.

    sage: P.<a,b,c,d,e,f,h> = PolynomialRing(GF(32003),order='lex')
    sage: I = sage.rings.ideal.Cyclic(P,6).homogenize(h)
    sage: J = Ideal(I.groebner_basis())
    sage: R = PolynomialRing(GF(32003),P.ngens(),list(reversed(P.variable_names())),order='lex')
    sage: H = Ideal([R(str(f)) for f in J.gens()])

The generators of H do not form a Gröbner basis in R which is just P with its variables reversed. If we are only trying to solve a system of equations choosing the right permutation of variables might have a significant impact on the performance of our Gröbner basis algorithm:

    sage: t = cputime()
    sage: gb = H.groebner_basis('libsingular:std')
    sage: gb[-1].degree()
    19
    sage: cputime(t) # output random-ish
    25.36...

While in this example it is easy to see which variable permutation is the cheapest one, this is not necessarily the case in general. The GeometricXL algorithm is invariant under any linear change of coordinates and has the following property:

Let D be the degree reached by the XL algorithm to solve a given system of equations under the optimal linear change of coordinates. Then GeometricXL will also solve this system of equations for the degree D, without applying this optimal linear change of coordinates first.

The above behaviour holds under two assumptions:

  • the characteristic of the base field K is bigger than D and
  • the system of equations has one over “very few” solution.

To demonstrate this behaviour, consider the synthetic benchmark available here which is a Gröbner basis under a linear change of coordinates:

 

    sage: e,h = random_example(n=6)

e is the original easy system while h is the “rotated” system

    sage: e.basis_is_groebner()
    True

    sage: max([f.total_degree() for f in e.gens()])
    2

    sage: h.basis_is_groebner()
    False

    sage: max([f.total_degree() for f in h.gens()])
    2

GeometricXL recovers linear factors and thus candidates for common roots at D=2:

    sage: hH = h.homogenize()
    sage: f = GeometricXL(hH, D=2); f.factor(False)
    0.0...s -- 1. D: 2
    ...
    (-2684)
    * (-1056*x5 - 2964*x4 - 177*x3 + 6206*x2 + 376*x1 + 6257*x0 + h)
    * (2957*x5 - 792*x4 - 4323*x3 - 14408*x2 - 2750*x1 - 8823*x0 + h)

While any Gröbner basis algorithm would have to reach at least degree 64:

    sage: gb = h.groebner_basis('libsingular:slimgb')
    sage: gb[-1].degree()
    64

While my toy implementation is neither robust not efficient, the following table (using the same synthetic benchmark as above) should give some indication that for some problems GeometricXL is a good choice:

Synthetic Benchmark for GeometricXL Algorithm over GF(32003)
n GeometricXL degree GeometricXL time in seconds Singular 3-0-4 time in seconds Magma 2.14 time in seconds Gröbner basis degree
2 2 0.01 0.00 0.00 4
3 2 0.07 0.00 0.00 8
4 2 0.18 0.00 0.00 16
5 2 0.37 0.02 0.01 32
6 2 0.73 0.12 0.04 64
7 2 1.36 0.94 0.09 128
8 2 2.35 9.59 0.56 256
9 2 4.24 117.43 3.79 512
10 2 7.34 28.68 1024
11 2 12.08 205.05 2048
12 2 20.03
13 2 35.53
14 2 53.98
15 2 88.36
16 2 143.16

While the benchmark is synthetical and unrealistic, there are a few problems in multivariate quadratic public key cryptography which might be tackled using an idea similar to the GeometricXL algorithm. Also, note that constructing a GeometricMatrixF5 algorithm is straight forward by replacing the first step of the algorithm.

A Simple Bitslice Implementation of PRESENT

I had a hunch the other day which required $2^{36}$ plaintext—ciphertext pairs to be tested. While the hunch turned out to be wrong, at least I got a simple
bitslice implementation of PRESENT (in pure C99) out of it. The implementation is far from optimal:

  • it is pure C but critical paths should be pushed down to assembly,
  • it doesn’t use SSE2’s 128-bit instructions but it should, and
  • the S-Box is hardly optimised (it is simply ANF).

Still, it is better than what I found on the web via a quick Google search. On my 2.33 Ghz Core2Duo Macbook Pro it seems to run at 28 cycles per byte for long messages (not counting the key schedule). For comparison, I think the current speed of AES in bit slice mode is like 10 cycles per byte on the Core2. If I need some performance in the future, I might sit down and improve the implementation (such that it doesn’t totally suck) but for now I have to attend to other projects.