On dual lattice attacks against small-secret LWE and parameter choices in HElib and SEAL

My paper on solving small, sparse secret instances is now on ePrint. Here’s the abstract:

We present novel variants of the dual-lattice attack against LWE in the presence of an unusually short secret. These variants are informed by recent progress in BKW-style algorithms for solving LWE. Applying them to parameter sets suggested by the homomorphic encryption libraries HElib and SEAL yields revised security estimates. Our techniques scale the exponent of the dual-lattice attack by a factor of $(2\,L)/(2\,L+1)$ when $\log q = \Theta{\left(L \log n\right)}$, when the secret has constant hamming weight $h$ and where $L$ is the maximum depth of supported circuits. They also allow to half the dimension of the lattice under consideration at a multiplicative cost of $2^{h}$ operations. Moreover, our techniques yield revised concrete security estimates. For example, both libraries promise 80 bits of security for LWE instances with $n=1024$ and $\log_2 q \approx {47}$, while the techniques described in this work lead to estimated costs of 68 bits (SEAL) and 62 bits (HElib).

If you want to see what its effect would be on your favourite small, sparse secret instance of LWE, the code for estimating the running time is included in our LWE estimator. The integration into the main function estimate_lwe is imperfect, though. To get you started, here’s the code used to produce the estimates for the rolling example in the paper.

• Our instance’s secret has hamming weight $h=64$ and a ternary secret. We always use sieving as the SVP oracle in BKZ:

sage: n, alpha, q = fhe_params(n=2048, L=2)
sage: kwds = {"optimisation_target": "sieve", "h":64, "secret_bounds":(-1,1)}

• We establish a base line:

sage: print cost_str(sis(n, alpha, q, optimisation_target="sieve"))

• We run the scaled normal form approach from Section 4 and enable amortising costs from Section 3 by setting use_lll=True:

sage: print cost_str(sis_small_secret_mod_switch(n, alpha, q, use_lll=True, **kwds))

• We run the approach from Section 5 for sparse secrets. Setting postprocess=True enables the search for solutions $\mathbf{s}_1$ with very low hamming weight (page 17):

sage: print cost_str(drop_and_solve(sis, n, alpha, q, postprocess=True, **kwds))

• We combine everything:

sage: f = sis_small_secret_mod_switch
sage: print cost_str(drop_and_solve(f, n, alpha, q, postprocess=True, **kwds))


GSW13: 3rd Generation Homomorphic Encryption from Learning with Errors

This week our reading group studied Homomorphic Encryption from Learning with Errors: Conceptually-Simpler, Asymptotically-Faster, Attribute-Based by Craig Gentry, Amit Sahai and Brent Waters: a 3rd generation fully homomorphic encryption scheme.

The paper is partly motivated by that multiplication in previous schemes was complicated or at least not natural. Let’s take the BGV scheme where ciphertexts are simply LWE samples $a_i, b_i = -\langle a_i, s\rangle + \mu_i \cdot \lceil q/2\rfloor + e_i$ for $a_i \in \mathbb{Z}_q^n$ and $b_i \in \mathbb{Z}_q$ with $\mu_i$ being the message bit $\in \{0,1\}$ and $e_i$ is some “small” error. Let’s write this as $c_i = (a_i, b_i) \in \mathbb{Z}_q^{n+1}$ because it simplifies some notation down the line. In this notation, multiplication can be accomplished by $c_1 \otimes c_2$ because $\langle c_1 \otimes c_2, s \otimes s\rangle \approx \mu_1 \cdot \mu_2$. However, we now need to map $s \otimes s$ back to $s$ using “relinearisation”, this is the “unnatural” step.

However, this is only unnatural in this particular representation. To see this, let’s rewrite $a_i, b_i$ as a linear multivariate polynomial $f_i = b_i - \sum_{j=1}^n a_{ij} \cdot x_j \in \mathbb{Z}_q[x_1,\dots,x_n]$. This polynomial evaluates to $\approx \mu$ on the secret $s = (s_1,\dots,s_n)$. Note that evaluating a polynomial on $s$ is the same as reducing it modulo the set of polynomials $G = (x_1 - s_1,\dots, x_n - s_n)$.

Cryptanalysis of the FHE based on GACD?

Jintai Ding and Chengdong Tao published a new preprint on the IACR’s ePrint titled A New Algorithm for Solving the Approximate Common Divisor Problem and Cryptanalysis of the FHE based on GACD.

*Abstract. *In this paper, we propose a new algorithm for solving the approximate common divisors problems, which is based on LLL reduction algorithm of certain special lattice and linear equation solving algorithm over integers. Through both theoretical argument and experimental data, we show that our new algorithm is a polynomial time algorithm under reasonable assumptions on the parameters. We use our algorithm to solve concrete problems that no other algorithm could solve before. Further more, we show that our algorithm can break the fully homomorphic encryption schemes, which are based on the approximate common divisors problem, in polynomial time in terms of the system parameter λ.

It is worth emphasising that the Approximate GCD problem not only underpinsone of the few fully homomorphic encryption schemes we have but it is also somewhat related to one of two candidates for multilinear maps. So if it could be shown to be easy then this would be somewhat sad as the choice of problems for building fancy crypto schemes would have gotten a lot smaller. So what is the Approxmiate GCD problem?

Approximate Greatest Common Divisions Problem: Given polynomially many samples x_{i} = q_{i}· p + r_{i} where x_{i} = O(2^{γ}), r_{i} = O(2^{ρ}) and p = O(2^{η}), recover p.

The algorithm proceeds by using the LLL algorithm to find relations .

Note that if enough such relations can be found then this gives a linear system of equations in the r_{j} which we’d only need to solve. So how does the algorithm use LLL to recover the a_{ij}? It sets up a lattice basis of dimension (t+1) × (t+1) as follows:

Here, N is simply a random integer O(2^{γ}). Now, the authors claim that running LLL on the lattice spanned by B returns about t-2 of the desired relations. They are unable to rigorously argue why this should happen but offer the following intuition. Any  vector  in the lattice spanned by B has the form . Considering the last component = = they speculate that that LLL would focus on the left hand side of this expression as the right hand side would be rather small anyway. Making implies which in turn implies , if I understood correctly.

An implementation of the first step of this algorithm for Sage is given here (in a Sage cell). Indeed, if you plug in the parameters from the authors’ table on page 9, we do get our desired relations out.

Finally, let’s look at the application to parameters as they are used in cryptography. The authors consider the fully homomorphic encryption scheme due to Marten van Dijkm Craig Gentry, Shai Halevi, Vinod Vaikuntanathan which sets γ = λ^{5}, η = λ^{2} and ρ = λ for a security level of λ, i.e. 2^{λ} operations should be needed to break it. Here is what the authors write:

We apply our algorithm to the parameters in [ 6 ] and we could break all the cases where their parameter γ < 2^{20}.

It is not really clear to me if the authors actually ran their attack or not. On the one hand, we have that a choice of parameters where γ < 2^{20} would correspond to  λ=16 as (2^{20})^{(1/5)} = 2^{4}. Hence, attacking such dimensions would not mean much. On the other hand, the estimates by the authors about LLL would mean their attack would cost 2^{135} operations.

However, as far as I can tell, it does not work for these kind of parameters. That is, LLL fails to find the desired relations once we choose parameters as they are suggested in the cryptographic literature (cf. the example in the Sage cell above).

There are two reasons why the algorithm might fail:

1. The target vectors might not be among the shortest vectors in the lattice. For the parameters on page 9 it seems this condition holds. It is not clear that the condition holds in general. While on page 7 the authors argue for the existence of target vectors within the approximation radius of LLL, the algorithm on page 8 expects vectors which are smaller than suggested by the Gaussian heuristic, which seems to be what is needed to me.
2. The target vectors are the shortest vectors in the lattice. However, LLL cannot find them. In such a case it seems the situation is somewhat similar to the situation discussed in this comment from [[http://eprint.iacr.org/2009/616.pdf%5D%5B%5B6]]]:

On the other hand, when t is large, ~v likely is the shortest vector in L, but known lattice reductions algorithms will not be able to find it efficiently. Specifically, as a rule of thumb, they require time roughly 2^{(t/k)} to output a 2^{k} approximation of the shortest vector. Since clearly there are exponentially (in t) many vectors in L of length at most |x_{0}|√(t + 1) < 2^{γ}√(t + 1), which is about 2^{(η−ρ)} times longer than ~v, we need better than a 2^{(η−ρ)} approximation. For t ≥ γ/η, the time needed to guarantee a 2^{η} approximation (which is not even good enough to recover ~v) is roughly 2γ/η^{2}.  Thus setting γ/η^{2} = ω(log λ) foils this attack.

So if I understand this correctly, they should have a condition on t which implies that the target vectors are smaller than what the Gaussian heuristic suggests by the appropriate LLL Unique SVP factor. In particular, they ask if there are target vectors with

|μ_i| < 1/√(t +1) 2^{(γ/(t+1) + t/4)}

but it should be more like

|μ_i| < τ/√(t +1) 2^{(γ/(t+1) – t/4)}

i.e. within the LLL approximation radius there shouldn’t be any other vectors (where τ is the Unique-SVP factor ~0.5).

Update: Since the authors show that if a short vector exists it must satisfy their conditions, this argument is invalid. Still, the authors did not show that they are able to find short enough vectors for parameters as they are found in the literature.

Of course, I might have missed something.

A Fully Homomorphic Cryptosystem with Approximate Perfect Secrecy

At CT-RSA 2013 a paper titled “A Fully Homomorphic Cryptosystem with Approximate Perfect Secrecy” by Michal Hojsík and Veronika Půlpánová was presented. Here is the abstract:

We propose a new fully homomorphic cryptosystem called Symmetric Polly Cracker (SymPC) and we prove its security in the information theoretical settings. Namely, we prove that SymPC approaches perfect secrecy in bounded CPA model as its security parameter grows (which we call approximate perfect secrecy). In our construction, we use a Gröbner basis to generate a polynomial factor ring of ciphertexts and use the underlying field as the plaintext space. The Gröbner basis equips the ciphertext factor ring with a multiplicative structure that is easily algorithmized, thus providing an environment for a fully homomorphic cryptosystem.

The proposal seems to have succeeded where we could not: a fully homomorphic encryption scheme that also is information theoretic secure. Indeed, the authors reference our work and point out that they are taking a different approach (from ours) which allows them to succeed in realising these two goals.

To understand the claim made, here’s a quick rehash of our Symmetric Polly Cracker (SPC) for d=1 and b=2.

The secret key is a Gröbner basis . To encrypt we pick  and publish where is the message we want to encrypt. Decryption is easy if we know because it is equivalent to computing normal forms modulo . Indeed, it can be shown that the problem of finding under a chosen plaintext attack is as hard as finding which we assume is a hard problem. This scheme is homomorphic: we can do additions and multiplications of ciphertexts which decrypt to the sums and products of plaintexts. However, the scheme is not fully homomorphic as the ciphertext size increases with each multiplication. Also, the problem of computing the Gröbner basis becomes easy once we published many encryptions, so the scheme only supports a limited number of encryptions. So far, so general.

Now, let’s take a look at the new approach. Despite the claim that “A Fully Homomorphic Cryptosystem with Approximate Perfect Secrecy” is a new approach, it is – as far as I can see – a tweak of this general construction (essentially going back to Koblitz and Fellows). The two tweaks are:

1. is augmented with the so-called “field polynomials” as they evaluate to zero on every element of (Note: the actual construction is slightly different, which I ignore here for clarity of presentation).
2. Instead of limiting the number of encryptions to some such that the Gröbner basis problem is assumed to be hard, the number of encryptions is limited to some value .

The first tweak means that after a certain number of multiplications ciphertexts do not grow in size any more. That is, the largest monomial (under some degree compatible ordering) is . This allows to call the scheme “compact” and hence allows to declare it a fully homomorphic scheme under the technical definition of compactness. Yet, this means that ciphertexts are exponentially big in (e.g., if , we are talking about ciphertexts with bits). I am not convinced these should be called “compact”.

The second tweak implies that a computationally unbound attacker’s chance of breaking the scheme approaches zero as approaches infinity. There simply aren’t enough equations to recover . Hence, at the cost of making the scheme exceptionally short-lived it is information theoretic secure (asymptotically).

Ring-LWE and the GB(N) Problem

Over at the Bristol Cryptography Blog Martijn Stam writes about our “Polly Cracker, Revisted” paper:

We did not discuss the paper  in great detail, but Jake did mention one interesting avenue for continued research. Given that this new approach allows one to cast both LWE and approximate GCD in the same framework, can one also capture ring-LWE. If so, this might enable a better comparison of the various fully homomorphic encryption (FHE) schemes out there. The hope expressed by Jake was that this might allow a reduction to standard LWE (for the current batch of ring-LWE schemes), which would boost our confidence in those schemes.

This motivated me to express the Ring-LWE problem in a language of Gröbner bases, here’s what I could come up with so far. Continue reading “Ring-LWE and the GB(N) Problem”

Chen & Nguyen’s algorithm and Arora & Ge’s algorithm

In Faster Algorithms for Approximate Common Divisors: Breaking Fully-Homomorphic-Encryption Challenges over the Integers Yuanmi Chen and Phong Q. Nguyen (preprint here) propose a new algorithm for solving the approximate GCD problem. It drops the complexity from $2^{2\rho}$ to $2^{3/2\rho}$ in the general case and from $2^{\rho}$ to $2^{\rho/2}$ in the partial case (one multiple of $p$ is given noise-free) which is a pretty big deal.

The algorithm is based on two key ideas (explained using the partial approximate GCD problem):

1. Noisy solving reduced to noise-free solving

Similar to Arora & Ge’s algorithm for solving LWE Chen and Nguyen reduce the approximate solving problem to a noise-free solving problem. In fact, the strategy is exactly the same (cf. also this post). Given noisy ideal elements $f_i = \sum h_i g_i + r_i$ where $g_i$ are generators of the ideal, $h_i$ are ring elements and $r_i$ are small noise terms, then

$F_i = f_i \cdot \prod (f_i + j)(f_i - j)$

will be elements of the ideal $I$ spanned by $g_i$ if $j$ is big enough (depending on the exact setup we may drop the $-j$ part). In the approximate GCD case $g_0$ is simply a small odd integer (often denoted $p$). Additionally, if we are given some sufficient “description” of some sufficiently big ideal $\langle G_1,\dots,G_s \rangle = J \supset I$ (i.e., all elements of $I$ are in $J$ but not vice versa and $J$ is considerably bigger than $I$) then we can compute

$F_i = f_i \cdot \prod (f_i + j)(f_i - j) \mod J$

which keeps the size of $F_i$ small-ish. This is the role of $x_0$, the noise free multiple of $p$ in the partial approximate GCD problem. Now, one simply solves the noise free system $F_1,\dots,F_m$. In the PAGCD case this means to compute a single GCD, in the multivariate polynomial case (including LWE) this means to compute a Gröbner basis (or linearise, which is the same thing for the cases we are concerned with). Hence, so far Arora&Ge and Chen&Nguyen are really the same thing (it should be mentioned that this ideal due to Nguyen was already mentioned in this paper) applied to different rings.

However, this is not really why the Chen & Nguyen algorithm is efficient (although this already provides a speed-up by a factor of 5).

2. Efficient multiplication

The key idea to drop the exponent from $2^{\rho}$ to $2^{\rho/2}$ is as follows. Instead of computing with integers we compute univariate polynomials mod $x_0$, i.e. one defines

$f_j(x) = \prod_{j=0}^{j-1} (x_1 - (x + i)) \in \mathbb{F}_{x_0}[x]$

and notices that for $\rho' = \lfloor \rho/2 \rfloor$:

$\prod_{i=0}^{2^\rho-1} (x_1 - i) = \prod_{k=0}^{2^{\rho - \rho'} -1} f_{2^{\rho'}}(2^{\rho'}k)$

i.e., we can reduce $2^\rho -1$ multiplications to $2^{\rho - \rho'} - 1$ multiplications and $2^{\rho - \rho'} - 1$  polynomial evaluations. It turns out, this can be done in $\mathcal{O}(2^{\rho'})$. For the details read the paper.

But to get back to my previous point: It turns out the Arora&Ge perspective on noisy system solving is also useful for approximate GCDs. Which provides further evidence that it is useful to generalise LWE and AGCD to ideal theoretic problems in multivariate polynomial rings.