A Fun Bug? Help Wanted!

Over at Low BDD Estimate · Issue #33, Ben reports a bug that, at first sight, unfortunately does not stand out much: for some unusual parameters the Lattice Estimator reports wrong results. Given that we rely on a variety of heuristics to keep the running time down and given that the whole thing is mostly developed on the side, this is, again unfortunately, not surprising.

Debugging “corner cases” can often do wonders to improve the robustness of a given piece of software. For example, back in the day when I worked a lot on M4RI, as much as I dreaded fixing bugs that only showed up on Solaris boxes, those bugs always revealed some shady assumptions in my code that were bound to produce problems elsewhere down the line.

Indeed, I think this bug puts a finger on the heuristics we rely upon and where they can go wrong. The parameter sets that Ben has in mind are quite unusual in that we have to pick quite a large dimension d (or, equivalently, a large number m of LWE samples) to make the target uniquely short. That is, I suspect fixing this bug would take a bit more than increasing the precision of some numerical computation here or there, or fixing/adding some if statement to account for a corner case. This makes bugs like these a high-ish priority.
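For a flavour of the kind of call involved, here is a minimal sketch using the estimator’s Python API; the parameters below are made up purely for illustration and are not the ones from issue #33.

from estimator import *

# Illustrative parameters only, NOT the instance from issue #33: a small-error
# instance of the sort where many samples m (equivalently, a large embedding
# dimension d) are needed before the target becomes uniquely short.
params = LWE.Parameters(n=256, q=2**32, Xs=ND.UniformMod(3), Xe=ND.DiscreteGaussian(3.0))

print(LWE.primal_usvp(params))  # unique-SVP attack estimate
print(LWE.primal_bdd(params))   # BDD attack estimate, the one the issue is about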

On the other hand, truth be told, between this, the estimator being “mostly developed on the side” and all the other stuff I have to do, I doubt I’ll sink significant time into fixing this bug anytime soon.

But, and this is the point of this post, perhaps someone would like to take this bug as an invitation to help improve the Lattice Estimator? While the estimator is quite widely relied upon to, well, estimate the difficulty of solving LWE and related problems, its bus factor is uncomfortably low. I’d say attempting to fix this bug would take whoever tries on a whirlwind tour through the code base; a good way to learn it and to improve it.

Interested? Get in touch.

On BDD with Predicate: Breaking the “Lattice Barrier” for the Hidden Number Problem

Nadia and I put our pre-print and our source code online for solving bounded distance decoding when augmented with some predicate f(\cdot) that evaluates to true on the target and false (almost) everywhere else. Here’s the abstract:

Lattice-based algorithms in cryptanalysis often search for a target vector satisfying integer linear constraints as a shortest or closest vector in some lattice. In this work, we observe that these formulations may discard non-linear information from the underlying application that can be used to distinguish the target vector even when it is far from being uniquely close or short.

We formalize lattice problems augmented with a predicate distinguishing a target vector and give algorithms for solving instances of these problems. We apply our techniques to lattice-based approaches for solving the Hidden Number Problem, a popular technique for recovering secret DSA or ECDSA keys in side-channel attacks, and demonstrate that our algorithms succeed in recovering the signing key for instances that were previously believed to be unsolvable using lattice approaches. We carried out extensive experiments using our estimation and solving framework, which we also make available with this work.
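To illustrate the basic point in a toy setting (this is not our implementation, which combines lattice reduction, enumeration and the predicate), consider a target that is not uniquely close to the lattice: plain CVP is ambiguous, but a predicate on the candidates still singles out the right vector. Everything below is made up for the demo.

import itertools
import numpy as np

B = np.array([[4, 1], [1, 4]])   # toy basis, rows span the lattice
v = np.array([2, 1]) @ B         # the vector we want to recover: (9, 6)
t = v + np.array([2, -1])        # noisy target: (11, 5)

def predicate(w):
    # stand-in for the non-linear side information from the application,
    # e.g. checking a public-key relation; hardcoded here for the demo
    return np.array_equal(w, v)

radius = np.linalg.norm(t - v)
candidates = []
for c in itertools.product(range(-5, 6), repeat=2):
    w = np.array(c) @ B
    if np.linalg.norm(t - w) <= radius + 1e-9:
        candidates.append(w)

print(len(candidates))                          # 2: the BDD solution is not unique
print([w for w in candidates if predicate(w)])  # but the predicate keeps only v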

Continue reading “On BDD with Predicate: Breaking the “Lattice Barrier” for the Hidden Number Problem”

Multicore BKZ in FPLLL

There have been a few works recently that give FPLLL a hard time when considering parallelism on multicore systems. That is, they compare FPLLL’s single-core implementation against their multi-core implementations, which is fair enough. However, support for parallel enumeration has existed for a while in the form of fplll-extenum. Motivated by these works we merged that library into FPLLL itself a year ago. However, we didn’t document the multicore performance that this gives us. So here we go.

I ran

for t in 1 2 4 8; do
  ./compare.py -j  $(expr 28 / $t)  -t $t -s 512 -u 80 ./fplll/strategies/default.json
done

where compare.py is from the strategizer and default.json is from a PR against FPLLL. Note that these strategies were not optimised for multicore performance. I ran this on a system with two Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, i.e. 28 cores. The resulting speed-ups are:

[Plot: BKZ speed-up vs number of cores, default strategies (multicore-bkz-in-fplll-default.png)]

As you can see, the speed-up is okay for two cores but diminishes as we throw more cores at the problem. I assume that this is partly due to block sizes being relatively small (for larger block sizes G6K – which scales rather well on multiple cores – will be faster). I also suspect that this is partly an artefact of not optimising for multiple cores, i.e. picking the trade-off between enumeration (multicore) and preprocessing (partly single-core due to LLL calls) right. If someone wants to step up and compute some strategies for multiple cores, that’d be neat.

Virtual FPLLL Days 6 aka Bounded Distance Development

The sixth FPLLL Days will be held on 19 and 20 November 2020. For obvious reasons they will be held online. (Who knows, we might end up liking the format enough to do more of these online in a post-COVID world, too).

As with previous incarnations of FPLLL Days, everyone who wishes to contribute to open-source lattice-reduction software is welcome to attend. In particular, you do not have to work on FPLLL or any of its sibling projects like FPyLLL or G6K. Work on whatever advances the state of the art or seems useful to you personally. That said, we do have the ambition to suggest projects from the universe of FPLLL and I personally hope that some people will dig in with me to do the sort of plumbing that keeps projects like these running. To give you an idea of what people worked on in the past, you can find the list of projects of previous FPLLL Days on the wiki and a report on the PROMETHEUS blog.

The format is yet to be determined. We are going to coordinate using Zulip. I think it would also be useful to have a brief conference call, roll-call style, on the first day to break the ice and to make it easier for people to reach out to each other for help during the event. This is why we’re asking people to indicate their timezone on the wiki. If you have ideas for making this event a success, please let us know.

PS: For the record: Joe threatened a riot if I didn’t call it “Bounded Distance Development”, so credit goes to him for the name.

Faster Enumeration-based Lattice Reduction

Our paper “Faster Enumeration-based Lattice Reduction: Root Hermite Factor k^{1/(2k)} in Time k^{k/8\, +\, o(k)}” – together with Shi Bai, Pierre-Alain Fouque, Paul Kirchner, Damien Stehlé and Weiqiang Wen – is now available on ePrint (the work has been accepted to CRYPTO 2020). Here’s the abstract:

We give a lattice reduction algorithm that achieves root Hermite factor k^{1/(2k)} in time k^{k/8 + o(k)} and polynomial memory. This improves on the previously best known enumeration-based algorithms which achieve the same quality, but in time k^{k/(2e) + o(k)}. A cost of k^{k/8 + o(k)} was previously mentioned as potentially achievable (Hanrot-Stehlé’10) or as a heuristic lower bound (Nguyen’10) for enumeration algorithms. We prove the complexity and quality of our algorithm under a heuristic assumption and provide empirical evidence from simulation and implementation experiments attesting to its performance for practical and cryptographic parameter sizes. Our work also suggests potential avenues for achieving costs below k^{k/8 + o(k)} for the same root Hermite factor, based on the geometry of SDBKZ-reduced bases.
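To get a feel for the gap between the two exponents, ignoring the o(k) terms, here is a quick back-of-the-envelope comparison in Python:

from math import e, log2

# log2 cost of k^(k/(2e)) (previously best enumeration) vs k^(k/8) (this work),
# with all o(k) lower-order terms ignored
for k in (100, 250, 500):
    print(k, round(k / (2 * e) * log2(k)), round(k / 8 * log2(k)))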

Continue reading “Faster Enumeration-based Lattice Reduction”

Revisiting the Expected Cost of Solving uSVP and Applications to LWE

Our — together with Florian Göpfert, Fernando Virdia and Thomas Wunderer — paper Revisiting the Expected Cost of Solving uSVP and Applications to LWE is now available on ePrint. Here’s the abstract:

Reducing the Learning with Errors problem (LWE) to the Unique-SVP problem and then applying lattice reduction is a commonly relied-upon strategy for estimating the cost of solving LWE-based constructions. In the literature, two different conditions are formulated under which this strategy is successful. One, widely used, going back to Gama & Nguyen’s work on predicting lattice reduction (Eurocrypt 2008) and the other recently outlined by Alkim et al. (USENIX 2016). Since these two estimates predict significantly different costs for solving LWE parameter sets from the literature, we revisit the Unique-SVP strategy. We present empirical evidence from lattice-reduction experiments exhibiting a behaviour in line with the latter estimate. However, we also observe that in some situations lattice-reduction behaves somewhat better than expected from Alkim et al.’s work and explain this behaviour under standard assumptions. Finally, we show that the security estimates of some LWE-based constructions from the literature need to be revised and give refined expected solving costs.

Our work is essentially concerned with spelling out in more detail and experimentally verifying a prediction made in the New Hope paper on when lattice reduction successfully recovers an unusually short vector.

Denote by v the unusually short vector in some lattice \Lambda of dimension d (say, derived from some LWE instance using Kannan’s embedding), by \beta the block size used for the BKZ algorithm and by \delta_0 the root-Hermite factor for \beta. The New Hope paper then predicts that v can be found if

\sqrt{\beta/d} \|v\| \leq \delta_0^{2\beta-d} \, {\mathrm{Vol}(\Lambda)}^{1/d},

assuming the Geometric Series Assumption (GSA) holds (until a projection of the unusually short vector is found).

The rationale is that this condition ensures that the projection of v orthogonally to the first d-\beta (Gram-Schmidt) vectors (denoted as \pi_{d-\beta+1}(v)) is shorter than the expectation for the d-\beta+1-th Gram-Schmidt vector b_{d-\beta+1}^* under the GSA and thus would be found by the SVP oracle when called on the last block of size \beta. Hence, for any \beta satisfying the above inequality, the actual behaviour would deviate from that predicted by the GSA. Finally, the argument can be completed by appealing to the intuition that a deviation from expected behaviour on random instances — such as the GSA — leads to a revelation of the underlying structural, secret information. In any event, such a deviation would already solve Decision-LWE.
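Concretely, here is a small script that finds the smallest \beta satisfying the condition above, using the usual heuristic for \delta_0 as a function of \beta; the instance plugged in at the end is made up purely for illustration.

from math import e, pi, sqrt

def delta_0(beta):
    # standard root-Hermite factor heuristic for BKZ with block size beta
    return ((pi * beta) ** (1.0 / beta) * beta / (2 * pi * e)) ** (1.0 / (2 * (beta - 1)))

def min_beta(d, norm_v, log2_vol):
    # smallest beta with sqrt(beta/d) * ||v|| <= delta_0^(2*beta - d) * Vol(Lambda)^(1/d)
    for beta in range(60, d):
        lhs = sqrt(beta / d) * norm_v
        rhs = delta_0(beta) ** (2 * beta - d) * 2.0 ** (log2_vol / d)
        if lhs <= rhs:
            return beta
    return None

print(min_beta(d=700, norm_v=80.0, log2_vol=7000))  # toy numbers, purely illustrative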

In our work, we spell out this argument in more detail (e.g. how v is recovered from \pi_{d-\beta+1}(v)) and throw 23k core hours at the problem of checking if the predicted behaviour, e.g.

[Plot: predicted behaviour (prediction.png)]

matches the observed behaviour, e.g.

[Plot: observed behaviour (observation.png)]

Just like for the above plots, the general answer is a clear “yes”.

Pretty Pictures or GTFO

I forgot the most important bit. The behaviour of the BKZ algorithm on uSVP(-BDD) instances can be observed in this video.

You can observe the basis approaching the GSA until the SVP oracle finds the unusually short vector \pi_{d-\beta+1}(v). From \pi_{d-\beta+1}(v), v is then immediately recovered using size reduction. The grey area is the block currently being worked on. The notation in the legend isn’t consistent with the plots above or even internally (n vs. d), but the general idea should still be apparent. In case you’re wondering about the erratic behaviour of the tails (which occasionally go all over the place), this is due to a bug in fpylll which has recently been fixed.

Fplll Days 3: July 6 – 14, Amsterdam

We’ll have an fplll coding sprint aka “FPLLL Days” in July. This time around, we plan a slightly modified format compared to previous instances. That is, in order to encourage new developers to get involved, we plan to have a 2-day tutorial session (shorter or longer depending on participants/interest) before the start of FPLLL Days proper.

Continue reading “Fplll Days 3: July 6 – 14, Amsterdam”

fplll 5.1 and fpylll 0.2.4dev

New versions of fplll and fpylll were released today. I’ve reproduced the release notes below for greater visibility. The biggest user-visible changes for fplll are probably that

  • CVP enumeration is not experimental any more,
  • support for external enumeration libraries (go write that GPU implementation of enumeration) was added and
  • support for OSX was greatly improved.

On the fpylll side, the biggest user-visible changes are probably various API updates and a much nicer strategy/framework for gathering statistics about BKZ.
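For example, on the fpylll side the no-longer-experimental CVP enumeration can be called roughly like this (a minimal sketch with illustrative parameters):

from fpylll import IntegerMatrix, LLL, CVP

A = IntegerMatrix.random(30, "qary", k=15, q=127)  # small random q-ary lattice
LLL.reduction(A)                                   # closest_vector expects a reduced basis
t = tuple(range(30))                               # some arbitrary target vector
print(CVP.closest_vector(A, t))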

The next version of fplll will contain support for LLL reduction on Gram matrices.

Continue reading “fplll 5.1 and fpylll 0.2.4dev”

Cryptanalysis of the FHE based on GACD?

Jintai Ding and Chengdong Tao published a new preprint on the IACR’s ePrint titled A New Algorithm for Solving the Approximate Common Divisor Problem and Cryptanalysis of the FHE based on GACD.

Abstract. In this paper, we propose a new algorithm for solving the approximate common divisors problems, which is based on LLL reduction algorithm of certain special lattice and linear equation solving algorithm over integers. Through both theoretical argument and experimental data, we show that our new algorithm is a polynomial time algorithm under reasonable assumptions on the parameters. We use our algorithm to solve concrete problems that no other algorithm could solve before. Further more, we show that our algorithm can break the fully homomorphic encryption schemes, which are based on the approximate common divisors problem, in polynomial time in terms of the system parameter λ.

It is worth emphasising that the Approximate GCD problem not only underpins one of the few fully homomorphic encryption schemes we have, but is also somewhat related to one of the two candidates for multilinear maps. So if it could be shown to be easy then this would be somewhat sad, as the choice of problems for building fancy crypto schemes would have gotten a lot smaller. So what is the Approximate GCD problem?

Approximate Greatest Common Divisors Problem: Given polynomially many samples x_{i} = q_{i} · p + r_{i} where x_{i} = O(2^{γ}), r_{i} = O(2^{ρ}) and p = O(2^{η}), recover p.
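For concreteness, here is a toy instance generator following this definition; the sizes below are far smaller than anything cryptographic and purely illustrative.

from random import randrange

gamma, eta, rho, t = 150, 50, 20, 10       # toy sizes, not cryptographic

p = randrange(2 ** (eta - 1), 2 ** eta)    # the common divisor to recover
xs = [randrange(2 ** (gamma - eta)) * p + randrange(-2 ** rho, 2 ** rho + 1)
      for _ in range(t)]                   # samples x_i = q_i * p + r_i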

The algorithm proceeds by using the LLL algorithm to find relations \sum_{j} a_{ij} x_{j} = \sum_{j} a_{ij} r_{j}, i.e. integer combinations of the samples in which the multiples of p cancel.

Note that if enough such relations can be found then this gives a linear system of equations in the r_{j} which we’d only need to solve. So how does the algorithm use LLL to recover the a_{ij}? It sets up a lattice basis B of dimension (t+1) × (t+1) built from the samples x_{i} together with a random integer N of size O(2^{γ}).

The authors claim that running LLL on the lattice spanned by B returns about t-2 of the desired relations. They are unable to rigorously argue why this should happen, but offer the following intuition. Any vector in the lattice spanned by B is an integer combination of the rows of B, and its last component splits into a large part, involving the multiples of p and of N, and a small part involving the r_{i}. They speculate that LLL would focus on the large part of this expression, as the small part would be rather small anyway. Making the large part vanish then leaves a last component that is a small combination of the r_{i}, i.e. one of the desired relations, if I understood correctly.

An implementation of the first step of this algorithm for Sage is given here (in a Sage cell). Indeed, if you plug in the parameters from the authors’ table on page 9, we do get our desired relations out.

Finally, let’s look at the application to parameters as they are used in cryptography. The authors consider the fully homomorphic encryption scheme due to Marten van Dijk, Craig Gentry, Shai Halevi and Vinod Vaikuntanathan, which sets γ = λ^{5}, η = λ^{2} and ρ = λ for a security level of λ, i.e. 2^{λ} operations should be needed to break it. Here is what the authors write:

We apply our algorithm to the parameters in [6] and we could break all the cases where their parameter γ < 2^{20}.

It is not really clear to me if the authors actually ran their attack or not. On the one hand, we have that a choice of parameters where γ < 2^{20} would correspond to  λ=16 as (2^{20})^{(1/5)} = 2^{4}. Hence, attacking such dimensions would not mean much. On the other hand, the estimates by the authors about LLL would mean their attack would cost 2^{135} operations.

However, as far as I can tell, it does not work for these kinds of parameters. That is, LLL fails to find the desired relations once we choose parameters as they are suggested in the cryptographic literature (cf. the example in the Sage cell above).

There are two reasons why the algorithm might fail:

  1. The target vectors might not be among the shortest vectors in the lattice. For the parameters on page 9 it seems this condition holds. It is not clear that the condition holds in general. While on page 7 the authors argue for the existence of target vectors within the approximation radius of LLL, the algorithm on page 8 expects vectors which are smaller than suggested by the Gaussian heuristic, which seems to me to be what is actually needed.
  2. The target vectors are the shortest vectors in the lattice, but LLL cannot find them. In such a case the situation seems somewhat similar to that discussed in this comment from [6] (http://eprint.iacr.org/2009/616.pdf):

On the other hand, when t is large, ~v likely is the shortest vector in L, but known lattice reduction algorithms will not be able to find it efficiently. Specifically, as a rule of thumb, they require time roughly 2^{(t/k)} to output a 2^{k} approximation of the shortest vector. Since clearly there are exponentially (in t) many vectors in L of length at most |x_{0}|√(t + 1) < 2^{γ}√(t + 1), which is about 2^{(η−ρ)} times longer than ~v, we need better than a 2^{(η−ρ)} approximation. For t ≥ γ/η, the time needed to guarantee a 2^{η} approximation (which is not even good enough to recover ~v) is roughly 2^{(γ/η^{2})}. Thus setting γ/η^{2} = ω(log λ) foils this attack.

So if I understand this correctly, they should have a condition on t which implies that the target vectors are smaller than what the Gaussian heuristic suggests by the appropriate LLL Unique SVP factor. In particular, they ask if there are target vectors with

|μ_i| < 1/√(t+1) · 2^{γ/(t+1) + t/4}

but it should be more like

|μ_i| < τ/√(t+1) · 2^{γ/(t+1) − t/4}

i.e. within the LLL approximation radius there shouldn’t be any other vectors (where τ is the Unique-SVP factor ~0.5).
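The two bounds differ by a factor 2^{t/2}/τ, which is already substantial for moderate t; with made-up toy values:

from math import sqrt

gamma, t, tau = 150, 10, 0.5                    # toy values, purely illustrative
bound_paper = 2 ** (gamma / (t + 1) + t / 4) / sqrt(t + 1)
bound_usvp = tau * 2 ** (gamma / (t + 1) - t / 4) / sqrt(t + 1)
print(bound_paper / bound_usvp)                 # = 2^(t/2) / tau = 64 here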

Update: Since the authors show that if a short vector exists it must satisfy their conditions, this argument is invalid. Still, the authors did not show that they are able to find short enough vectors for parameters as they are found in the literature.

Of course, I might have missed something.