Cryptanalysis of my Somewhat Homomorphic PollyCracker Scheme

About 5 months ago I wrote about a homomorphic scheme based on integers and an adaptation of this schemes to use multivariate polynomials. At the very end I wrote: 

“Of course, I assume that my adaptation above can still be broken somehow since that’s what tends to happen with multivariate crypto schemes. Also, I’m really really not an expert on public-key cryptography. Hey, this is a blog post, not a research paper … so break it in the comments :)

Well, with some delay, here is how to break it. Recall, that the secret is some Gröbner basis g_0,\dots,g_{n-1} in \mathbb{F}[x_0,\dots,x_{n-1}] (assume \mathbb{F}_2 for simplicity) and that a ciphertext is constructed as

c = \sum_{i=0}^{n-1} f_ig_i + m' + m

where m' has constant coefficient 0 and f_i are random polynomials. Now, let \deg(f_i) = Q, \deg(g_i) = P, \deg(m') = N with N < P to ensure correct decryption (i.e. no interference of the noise with the normal form computation).

To break the scheme simply request n^(P+Q) encryptions of zero and compute the row echelon form on the linearisation of those polynomials of degree P+Q without elimination above the pivot (i.e. don’t compute the reduced row echelon form). Throw away any element of degree \leq N and call the resulting list S. This list S allows to decrypt any ciphertext c by reducing c modulo S.

Below, the attack in Sage:

class PolynomialHomomorphic:
    def __init__(self, l):
        self.l = l
        K = GF(2) # pick some field
        # we choose a ring with l unknowns, which
        # should make any GB computation at least
        # as hard as 2^l if we pick the ideal sufficiently
        # random.
        R = PolynomialRing(K, l, 'x', order='degrevlex')
        self.K = K
        self.R = R
        # our small ideal, defines the message space.
        self.b = Ideal([x for x in R.gens()])

        # these parameters are pretty arbitrary from a
        # security perspective!
        self.N = 1
        self.P = 2
        self.Q = 1
        self.key_gen()

    def key_gen(self):
        b,l = self.b, self.l
        K, R = self.K, self.R
        # we pick a random ideal with l element of degree P
        p = [R.gen(i)**self.P + R.random_element(degree=self.P-1)
              for i in range(l)]
        self.p = Ideal(p)

    def encrypt(self, m):
        # we pick some m' which encodes the
        # same plaintext but is bigger
        m = self.R(m)
        mprime = self.R.random_element(degree=self.N)
        mprime -= mprime.constant_coefficient()
        mprime += m.constant_coefficient()

       # adding a random ideal element
        for f in self.p.gens():
            mprime += self.R.random_element(degree=self.Q)*f
        return mprime

    def decrypt(self, c):
        # decryption is just as in the integer case.
        return c.reduce(self.p).reduce(self.b)

def break_cpa(pc, challenge):
    ciphertexts = [pc.encrypt(0) for _ in xrange(2*pc.l**(pc.Q+pc.P))]
    F = Sequence(ciphertexts)
    A,v = F.coefficient_matrix(sparse=False)
    E = A.echelon_form(full=False)
    s = dict([(f.lm(),f) for f in (E*v).list() if f.degree() >= pc.P])

    while True:
        print challenge
        if challenge.lm() in s:
            challenge = challenge - s[challenge.lm()]
        else:
            if challenge.degree() > pc.P:
                raise ValueError("Ah, snap! It didn't work.")
            else:
                return challenge.constant_coefficient()

This works because the noise is constructed in such a way as to never interfere with elimination: it does not affect any leading monomial of the ideal ever. Thus, we don’t need to consider it during the elimination and simply ignore the lower terms once we are done.

Note that this attack does not work against the integer homomorphic scheme by van Dijk et al. because there additions are not free: When we perform elimination of the higher order bits in the integer scheme also the noise accumulates; eventually it surpasses the size of the prime p leaving us with no information. Put differently, we can attack schemes that are inspired by the integer-based scheme if additions are free. Thus, while it might seem tempting to replace integers by, say, univariate polynomials which are often considered essentially computationally equivalent to integers and which would provide free additions  it would break the security of the scheme.

Somewhat Homomorphic Encryption

Today I read the paper “Computing Arbitrary Functions of Encrypted Data” by Craig Gentry in which he explains the basic ideas behind his work on fully homomorphic encryption. If you don’t know what homomorphic encryption is: it means that one can evaluate a function on ciphertexts which has the same result as evaluating that function on the plaintexts. Thus, one can compute with data without getting to see it. The canonical example Gentry uses in his papers is a search engine that searches through sensitive data without seeing the actual content. It is similar to secure multi-party computation except that it is not interactive at all, i.e. you send your query and get the result back without any interaction in between. In the above mentioned paper Gentry gives a simplified version of a recent integer based fully homomorphic scheme. This simplified version is symmetric and only somewhat homomorphic (more on this below). Since the basic idea of the scheme is so simple, it is quickly implemented. The code below is actually some slight generalisation such that b isn’t always forced to be two.

class IntegerHomomorphic:
    def __init__(self, l, b=2):
        self.b = b
        self.l = l  # lambda is out security parameter
                    # these parameters determine the size
                    # of various numbers and are chosen such that
                    # the approximate GCD is assumed to be hard.
        self.N = 2**(l)
        self.P = 2**(l**2)
        self.Q = 2**(l**5)
        self.key_gen()

    def key_gen(self):
        # the secret key is simply an odd number of P bits.
        p = ZZ.random_element(0, self.P-1)
        if p%self.b == 0:
            p += ZZ.random_element(1, self.b)
        self.p = p

    def encrypt(self, m):
        # we want to encrypt 0 <= m < b, thus we pick some m'
        # for which m' % b == m % b holds to avoid having
        # known bits (== 0) in the computation below.
        mprime = ZZ.random_element(0,self.N-1)
        mprime = mprime - mprime%self.b + m%self.b
        # the ciphertext is this m' with some multiple of p
        # added to mask it.
        q = ZZ.random_element(0,self.Q-1)
        return mprime + self.p*q

    def decrypt(self, c):
        # if we now p decryption is easy since we can simply
        # mod out p and then recover the value mod b
        return ZZ((c % self.p) % self.b)

To see that this scheme is somewhat homomorphic observe that as long as the noise is small enough such that things don’t wrap around we have and for some number .

Here is a toy example:

sage: IH = IntegerHomomorphic(3,)
sage: IH.encrypt(0)
1269419566719153208598601566220616500605965999\
730518673686683244436179224382
sage: IH.encrypt(0) + IH.encrypt(1)
61002356472659294938919367702545237501719846898\
35782651710838758570047089506
sage: IH.decrypt(IH.encrypt(0) + IH.encrypt(1))
1
sage: IH.decrypt(IH.encrypt(0) + IH.encrypt(1)*IH.encrypt(1))
1

However, there is only a limited number of operations we can perform before we overflow, hence the “somewhat homomorphic” in the title:

sage: IH.decrypt(prod(IH.encrypt(1) for _ in range(1000)))
0
sage: IH.decrypt(prod(IH.encrypt(1) for _ in range(1000)))
1

The security of this construction can be reduced (according to the paper, I didn’t check it) to the approximate GCD problem, that is: given and , compute or in and . The problem hasn’t received much attention so far. Currently it is assumed to be hard if and have bits, has bits and has bits.

Thus, the size of the scheme is both bounded by the gap between and which dictates how much computations one can fit before wrapping around and the sizes required for the approximate GCD to be hard. Of course, if you spent more than three years starring at multivariate polynomial ideals you will have noticed by now that this whole scheme can be instantiated using multivariate polynomial ideals as well:

class PolynomialHomomorphic:
    def __init__(self, l, b=2):
        self.l = l
        K = GF(127)  # pick some field
        # we choose a ring with l unknowns, which
        # should make any GB computation at least
        # as hard as 2\^l if we pick the ideal sufficiently
        # random.
        R = PolynomialRing(K, l, 'x', order='degrevlex')
        self.K = K
        self.R = R
        # our small ideal, defines the message space.
        self.b = Ideal([x for x in R.gens()])

    # these parameters are pretty arbitrary from a
    # security perspective!
        self.N = 1
        self.P = 2
        self.Q = 5
        self.key_gen()

    def key_gen(self):
        b, l = self.b, self.l
        K, R = self.K, self.R
        # we pick a random ideal with l element of degree P
        p = [R.random_element(degree=self.P) for _ in range(l)]
        self.p = Ideal(p)

    def encrypt(self, m):
        # we pick some m' which encodes the
        # same plaintext but is bigger
        m = self.R(m)
        mprime = self.R.random_element(degree=self.N)
        mprime -= mprime - mprime.reduce(self.b)
        mprime += m.reduce(self.b)

        # adding a random ideal element
        for f in self.p.gens():
            mprime += self.R.random_element(degree=self.Q)*f
        return mprime

        def decrypt(self, c):
            # decryption is just as in the integer case.
            return c.reduce(self.p).reduce(self.b)

This scheme looks a bit similar to PolyCracker where the security was also related to normal form computations. PolyCracker was broken because normal forms are simpler than Gröbner bases, that is you don’t necessarily need a Gröbner basis to compute them. However, in this scheme you must find the ideal first, before you can start thinking about whether you need a Gröbner basis of it or not. Of course, I assume that my adaptation above can still be broken somehow since that’s what tends to happen with multivariate crypto schemes. Also, I’m really really not an expert on public-key cryptography. Hey, this is a blog post, not a research paper … so break it in the comments 🙂

Anyway, here’s a toy example:

sage: PH = PolynomialHomomorphic(2)
sage: f = PH.encrypt(3) * PH.encrypt(2) * PH.encrypt(1) + PH.encrypt(2)
sage: PH.decrypt(f)
8

Of course, this is just the first step. The really cool stuff happens after this simple step. Gentry constructs a fully homomorphic scheme from such a somewhat homomorphic scheme … if you want to know how, go read the paper 🙂

Update: See this post for a break of this scheme and corrected Sage code.