Cryptanalysis of the FHE based on GACD?

Jintai Ding and Chengdong Tao published a new preprint on the IACR’s ePrint titled A New Algorithm for Solving the Approximate Common Divisor Problem and Cryptanalysis of the FHE based on GACD.

*Abstract. *In this paper, we propose a new algorithm for solving the approximate common divisors problems, which is based on LLL reduction algorithm of certain special lattice and linear equation solving algorithm over integers. Through both theoretical argument and experimental data, we show that our new algorithm is a polynomial time algorithm under reasonable assumptions on the parameters. We use our algorithm to solve concrete problems that no other algorithm could solve before. Further more, we show that our algorithm can break the fully homomorphic encryption schemes, which are based on the approximate common divisors problem, in polynomial time in terms of the system parameter λ.

It is worth emphasising that the Approximate GCD problem not only underpinsone of the few fully homomorphic encryption schemes we have but it is also somewhat related to one of two candidates for multilinear maps. So if it could be shown to be easy then this would be somewhat sad as the choice of problems for building fancy crypto schemes would have gotten a lot smaller. So what is the Approxmiate GCD problem?

Approximate Greatest Common Divisions Problem: Given polynomially many samples x_{i} = q_{i}· p + r_{i} where x_{i} = O(2^{γ}), r_{i} = O(2^{ρ}) and p = O(2^{η}), recover p.

The algorithm proceeds by using the LLL algorithm to find relations .

Note that if enough such relations can be found then this gives a linear system of equations in the r_{j} which we’d only need to solve. So how does the algorithm use LLL to recover the a_{ij}? It sets up a lattice basis of dimension (t+1) × (t+1) as follows:

Here, N is simply a random integer O(2^{γ}). Now, the authors claim that running LLL on the lattice spanned by B returns about t-2 of the desired relations. They are unable to rigorously argue why this should happen but offer the following intuition. Any  vector  in the lattice spanned by B has the form . Considering the last component = = they speculate that that LLL would focus on the left hand side of this expression as the right hand side would be rather small anyway. Making implies which in turn implies , if I understood correctly.

An implementation of the first step of this algorithm for Sage is given here (in a Sage cell). Indeed, if you plug in the parameters from the authors’ table on page 9, we do get our desired relations out.

Finally, let’s look at the application to parameters as they are used in cryptography. The authors consider the fully homomorphic encryption scheme due to Marten van Dijkm Craig Gentry, Shai Halevi, Vinod Vaikuntanathan which sets γ = λ^{5}, η = λ^{2} and ρ = λ for a security level of λ, i.e. 2^{λ} operations should be needed to break it. Here is what the authors write:

We apply our algorithm to the parameters in [ 6 ] and we could break all the cases where their parameter γ < 2^{20}.

It is not really clear to me if the authors actually ran their attack or not. On the one hand, we have that a choice of parameters where γ < 2^{20} would correspond to  λ=16 as (2^{20})^{(1/5)} = 2^{4}. Hence, attacking such dimensions would not mean much. On the other hand, the estimates by the authors about LLL would mean their attack would cost 2^{135} operations.

However, as far as I can tell, it does not work for these kind of parameters. That is, LLL fails to find the desired relations once we choose parameters as they are suggested in the cryptographic literature (cf. the example in the Sage cell above).

There are two reasons why the algorithm might fail:

  1. The target vectors might not be among the shortest vectors in the lattice. For the parameters on page 9 it seems this condition holds. It is not clear that the condition holds in general. While on page 7 the authors argue for the existence of target vectors within the approximation radius of LLL, the algorithm on page 8 expects vectors which are smaller than suggested by the Gaussian heuristic, which seems to be what is needed to me.
  2. The target vectors are the shortest vectors in the lattice. However, LLL cannot find them. In such a case it seems the situation is somewhat similar to the situation discussed in this comment from [[http://eprint.iacr.org/2009/616.pdf%5D%5B%5B6]]]:

On the other hand, when t is large, ~v likely is the shortest vector in L, but known lattice reductions algorithms will not be able to find it efficiently. Specifically, as a rule of thumb, they require time roughly 2^{(t/k)} to output a 2^{k} approximation of the shortest vector. Since clearly there are exponentially (in t) many vectors in L of length at most |x_{0}|√(t + 1) < 2^{γ}√(t + 1), which is about 2^{(η−ρ)} times longer than ~v, we need better than a 2^{(η−ρ)} approximation. For t ≥ γ/η, the time needed to guarantee a 2^{η} approximation (which is not even good enough to recover ~v) is roughly 2γ/η^{2}.  Thus setting γ/η^{2} = ω(log λ) foils this attack.

So if I understand this correctly, they should have a condition on t which implies that the target vectors are smaller than what the Gaussian heuristic suggests by the appropriate LLL Unique SVP factor. In particular, they ask if there are target vectors with

|μ_i| < 1/√(t +1) 2^{(γ/(t+1) + t/4)}

but it should be more like

|μ_i| < τ/√(t +1) 2^{(γ/(t+1) – t/4)}

i.e. within the LLL approximation radius there shouldn’t be any other vectors (where τ is the Unique-SVP factor ~0.5).

Update: Since the authors show that if a short vector exists it must satisfy their conditions, this argument is invalid. Still, the authors did not show that they are able to find short enough vectors for parameters as they are found in the literature.

Of course, I might have missed something.

Chen & Nguyen’s algorithm and Arora & Ge’s algorithm

In Faster Algorithms for Approximate Common Divisors: Breaking Fully-Homomorphic-Encryption Challenges over the Integers Yuanmi Chen and Phong Q. Nguyen (preprint here) propose a new algorithm for solving the approximate GCD problem. It drops the complexity from 2^{2\rho} to 2^{3/2\rho} in the general case and from 2^{\rho} to 2^{\rho/2} in the partial case (one multiple of p is given noise-free) which is a pretty big deal.

The algorithm is based on two key ideas (explained using the partial approximate GCD problem):

1. Noisy solving reduced to noise-free solving

Similar to Arora & Ge’s algorithm for solving LWE Chen and Nguyen reduce the approximate solving problem to a noise-free solving problem. In fact, the strategy is exactly the same (cf. also this post). Given noisy ideal elements f_i = \sum h_i g_i + r_i where g_i are generators of the ideal, h_i are ring elements and r_i are small noise terms, then

F_i = f_i \cdot \prod (f_i + j)(f_i - j)

will be elements of the ideal I spanned by g_i if j is big enough (depending on the exact setup we may drop the -j part). In the approximate GCD case g_0 is simply a small odd integer (often denoted p). Additionally, if we are given some sufficient “description” of some sufficiently big ideal \langle G_1,\dots,G_s \rangle = J \supset I (i.e., all elements of I are in J but not vice versa and J is considerably bigger than I) then we can compute

F_i = f_i \cdot \prod (f_i + j)(f_i - j) \mod J

which keeps the size of F_i small-ish. This is the role of x_0, the noise free multiple of p in the partial approximate GCD problem. Now, one simply solves the noise free system F_1,\dots,F_m. In the PAGCD case this means to compute a single GCD, in the multivariate polynomial case (including LWE) this means to compute a Gröbner basis (or linearise, which is the same thing for the cases we are concerned with). Hence, so far Arora&Ge and Chen&Nguyen are really the same thing (it should be mentioned that this ideal due to Nguyen was already mentioned in this paper) applied to different rings.

However, this is not really why the Chen & Nguyen algorithm is efficient (although this already provides a speed-up by a factor of 5).

2. Efficient multiplication

The key idea to drop the exponent from 2^{\rho} to 2^{\rho/2} is as follows. Instead of computing with integers we compute univariate polynomials mod x_0, i.e. one defines

f_j(x) = \prod_{j=0}^{j-1} (x_1 - (x + i)) \in \mathbb{F}_{x_0}[x]

and notices that for \rho' = \lfloor \rho/2 \rfloor:

\prod_{i=0}^{2^\rho-1} (x_1 - i) = \prod_{k=0}^{2^{\rho - \rho'} -1} f_{2^{\rho'}}(2^{\rho'}k)

i.e., we can reduce 2^\rho -1 multiplications to 2^{\rho - \rho'} - 1 multiplications and 2^{\rho - \rho'} - 1  polynomial evaluations. It turns out, this can be done in \mathcal{O}(2^{\rho'}). For the details read the paper.

But to get back to my previous point: It turns out the Arora&Ge perspective on noisy system solving is also useful for approximate GCDs. Which provides further evidence that it is useful to generalise LWE and AGCD to ideal theoretic problems in multivariate polynomial rings.