We present novel variants of the dual-lattice attack against LWE in the presence of an unusually short secret. These variants are informed by recent progress in BKW-style algorithms for solving LWE. Applying them to parameter sets suggested by the homomorphic encryption libraries HElib and SEAL yields revised security estimates. Our techniques scale the exponent of the dual-lattice attack by a factor of 2L/(2L+1) when log q = Θ(L log n), when the secret has constant hamming weight h and where L is the maximum depth of supported circuits. They also allow halving the dimension of the lattice under consideration at a multiplicative cost of 2^(h/2) operations. Moreover, our techniques yield revised concrete security estimates. For example, both libraries promise 80 bits of security for LWE instances with n = 1024 and log q ≈ 47, while the techniques described in this work lead to estimated costs of 68 bits (SEAL) and 62 bits (HElib).
If you want to see what its effect would be on your favourite small, sparse secret instance of LWE, the code for estimating the running time is included in our LWE estimator. The integration into the main function `estimate_lwe` is imperfect, though. To get you started, here's the code used to produce the estimates for the rolling example in the paper.
Our instance's secret is ternary and has hamming weight 64. We always use sieving as the SVP oracle in BKZ:
sage: n, alpha, q = fhe_params(n=2048, L=2)
sage: kwds = {"optimisation_target": "sieve", "h": 64, "secret_bounds": (-1, 1)}
We establish a base line:
sage: print cost_str(sis(n, alpha, q, optimisation_target="sieve"))
We run the scaled normal form approach from Section 4 and enable amortising costs from Section 3 by setting `use_lll=True`:
sage: print cost_str(sis_small_secret_mod_switch(n, alpha, q, use_lll=True, **kwds))
We run the approach from Section 5 for sparse secrets. Setting `postprocess=True` enables the search for solutions with very low hamming weight (page 17):
sage: print cost_str(drop_and_solve(sis, n, alpha, q, postprocess=True, **kwds))
We combine everything:
sage: f = sis_small_secret_mod_switch
sage: print cost_str(drop_and_solve(f, n, alpha, q, postprocess=True, **kwds))
Perhaps the most significant change is the change of development model. Previous versions of fplll, while open-source, were developed behind closed doors with tarballs being made available more or less regularly. Reporting a bug meant dropping Damien an e-mail.
Starting in autumn 2014, development is now coordinated publicly on GitHub. Developers send pull requests, reporting a bug means opening an issue, etc. Hence, development is more transparent and, most importantly, inviting. Additionally, for those who wish to get involved, we collect how-to information in our contributing guidelines. There is now also a public mailing list for all things fplll development and the occasional joint coding sprint.
Fplll 5.0.0 switches from C++98 to C++11. While we haven’t upgraded all code to take advantage of C++11’s features, such as rvalue references, we try to make use of them when touching code. Marc Stevens helped a lot here by educating the rest of us. I, personally, also found Effective Modern C++ quite a good read.
Fplll now also has a test suite, testing basic arithmetic, LLL, BKZ, SVP, sieving and the pruner. Test coverage is not complete but this is quite an improvement over the 4.x series. We run these tests on every pull request and commit to master.
Writing code using fplll as a library instead of a command line program used to be hit and miss: did the compiler instantiate that template? We now force instantiation of templates and link against fplll as a library ourselves during testing. We also added pkg-config support and improved the build system so that `make -j8` actually runs faster than `make`.
Finally, we also provide API documentation online. You will notice that we adopted a naming convention inspired by PEP 8: `ClassName.function_name`.
In LLL land not much has changed in fplll 5.0.0, e.g. HPLLL wasn’t merged into fplll yet. However, Shi Bai added optional support for double-double (106 bits) and quad-double (212 bits) precision using the qd library. In our experience, double-double provides a speed-up, but quad-double does not.
Marc contributed a new implementation of enumeration. This implementation is recursive but avoids the usual performance drawback of recursive enumeration by making the compiler untangle it at compile time. The new implementation is not as fast as it could be, but it is noticeably faster than what was in the 4.x series. In the process, we also made enumeration on different objects thread-safe by eliminating global variables.
fplll 5.0 is the first public (open-source or not) complete implementation of BKZ 2.0 (see https://github.com/Tzumpi1/BKZ_2 for a previous but incomplete implementation). As mentioned in a previous post, the collection of techniques known as BKZ 2.0 is used in lattice-based cryptography to estimate the cost of strong lattice reduction. This led to the somewhat strange situation where everybody was essentially relying on a table in the BKZ 2.0 paper to predict the cost of certain cryptanalytical attacks without being able to reproduce or verify these numbers.
BKZ 2.0’s biggest improvement is due to the use of extreme pruning (Section 4.1 of the BKZ 2.0 paper). This, first of all, entails computing optimal pruning coefficients. The implementation in fplll for computing these coefficients — the pruner — was contributed by Léo Ducas. He also wrote the first implementation in Python for using these parameters in BKZ, i.e. by adding re-randomisation. I then re-implemented that part in C++ for fplll (and in Python for fpylll).
BKZ 2.0 also uses recursive preprocessing with BKZ in a smaller block size (Section 4.2 of the BKZ 2.0 paper). The implementation in fplll was written by me back in 2014.
Around the same time, Joop van der Pol contributed using the Gaussian heuristic bound in enumeration (Section 4.3 of the BKZ 2.0 paper).
Fplll also ships with strategies for BKZ reduction up to block size 90. These strategies provide pruning parameters and block sizes for recursive preprocessing. These strategies were computed using the strategizer discussed below.
Michael Walter contributed implementations of the Self-Dual BKZ algorithm and Slide Reduction. We don't ship dedicated reduction strategies for these algorithms, but the default strategies should work reasonably well (I haven't tried). Hence, these algorithms can now easily be compared against each other and will all benefit from future improvements to fplll such as faster enumeration etc.
C++11 has made writing C++ a lot easier. Still, C++ might not be for everyone. Python, however, is for everyone. In particular, with Sage and SciPy, Python has become a central language for computational mathematics. To make it easy for researchers to try out new algorithmic ideas, tweak algorithms or simply to experiment with existing algorithms, there is now fpylll, which provides an interface to fplll's API from Python and implements a few algorithms using that API in pure Python. See my previous post on fpylll for details.
The set of strategies shipped with fplll were computed using a Python library built on top of fpylll. This transparency allows others to reproduce and verify our choices or to improve them.
Shi Bai also contributed implementations of the GaussSieve as well as the TupleSieve. However, these can at present not be used as SVP oracles inside BKZ-style algorithms.
To get an impression of the difference between fplll 4.x and 5.x, consider the q-ary lattice generated by calling

latticegen q 100 50 30 b -randseed 1337
In the table below, k is the block size, t is the time in seconds it takes to run 10 tours of BKZ with block size k and ‖v‖² is the square of the Euclidean norm of the shortest vector in the reduced lattice.
software | k | strategy | t | ‖v‖² |
---|---|---|---|---|
fplll 4.0.4 | 40 | – | 326.43s | 1.10e10 |
fplll 5.0.0 | 40 | – | 75.71s | 1.22e10 |
fplll 5.0.0 | 40 | default | 3.64s | 1.17e10 |
fplll 5.0.0 | 60 | default | 120.67s | 8.85e9 |
Now that fplll 5.0.0 is out, we’ll work on integrating it into Sage (discussion and ticket).
The paper is partly motivated by the fact that multiplication in previous schemes was complicated or at least not natural. Let's take the BGV scheme where ciphertexts are simply LWE samples (a, b = ⟨a, s⟩ + 2e + μ) for a ∈ Z_q^n and secret s, with μ being the message bit and e is some "small" error. Let's write this as c = (b, −a) because it simplifies some notation down the line: decryption is then the inner product ⟨c, (1, s)⟩ = 2e + μ. In this notation, multiplication can be accomplished by the tensor product c1 ⊗ c2 because ⟨c1 ⊗ c2, (1, s) ⊗ (1, s)⟩ = ⟨c1, (1, s)⟩·⟨c2, (1, s)⟩. However, we now need to map c1 ⊗ c2 back to dimension n + 1 using "relinearisation", this is the "unnatural" step.
However, this is only unnatural in this particular representation. To see this, let's rewrite the ciphertext as a linear multivariate polynomial f(x) = b − ⟨a, x⟩. This polynomial evaluates to 2e + μ on the secret s. Note that evaluating a polynomial on s is the same as reducing it modulo the set of polynomials (x_1 − s_1, …, x_n − s_n).
Now, multiplying f1·f2 produces a quadratic polynomial. Its coefficients are produced from all the terms in the tensor product c1 ⊗ c2. In other words, the tensor product is just another way of writing the quadratic polynomial f1·f2. Evaluating f1·f2 on s gives (2e1 + μ1)·(2e2 + μ2) = 2(2e1·e2 + e1·μ2 + e2·μ1) + μ1·μ2. To map this operation to an inner product, note that evaluating a polynomial on s is the same as taking the inner product of its vector of coefficients and the vector of all monomials in s. For example, evaluating 2x² + 3x + 4 at x = s is the same as taking the inner product of (2, 3, 4) and (s², s, 1). That is, if we precompute all the products up to degree two of the coefficients of s, the remaining operations are just an inner product.
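As a sanity check, here is the single-variable version of this correspondence in plain Python; the modulus, secret and coefficients are arbitrary toy choices, not values from any scheme:

```python
# Toy check: multiplying two linear polynomials and evaluating the product
# at s equals the inner product of the product's coefficient vector with
# the monomial vector (s^2, s, 1).
q = 97
s = 5
a1, b1 = 13, 7    # f1 = a1*x + b1
a2, b2 = 31, 2    # f2 = a2*x + b2
# coefficients of the quadratic product f1*f2
c2, c1, c0 = a1*a2, a1*b2 + a2*b1, b1*b2
lhs = ((a1*s + b1) * (a2*s + b2)) % q   # evaluate, then multiply
rhs = (c2*s**2 + c1*s + c0) % q         # inner product with (s^2, s, 1)
assert lhs == rhs
```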
Still, f1·f2 is a quadratic polynomial and we'd want a linear polynomial. In BGV this reduction is referred to as "relinearisation" (which is not helpful if you're coming from a commutative algebra background). Phrased in terms of commutative algebra, what we're doing is to reduce modulo elements in the ideal generated by (x_1 − s_1, …, x_n − s_n).
Let's look at what happens when we reduce by some element in the ideal generated by x − s, sticking to one variable for simplicity. We start with the leading term of f1·f2, which is c2·x². To reduce this term we'd add −c2·x·(x − s) to f1·f2, which is an element in the ideal generated by x − s since it is a multiple of x − s. This produces a polynomial with leading term c2·s·x, i.e. a smaller leading term. Now, we'd add the appropriate element with leading term x and so on.
Of course, this process, as just described, assumes access to x − s, which is the same as giving out our secret s. Instead, we want to precompute multiples of x − s with leading terms x², x and 1 and publish those. This is still insecure, but adding some noise, assuming circular security and doing a bit more trickery we can make this as secure as the original scheme. This is then what "relinearisation" does. There is also an issue with noise blow-up, e.g. multiplying a sample by a large scalar makes its noise much bigger. Hence, we essentially publish elements with leading terms x², 2·x², 4·x², …, x, 2·x, 4·x, … and so on, which allows us to avoid those multiplications. Before I move on to GSW13 proper, I should note that all this has been pointed out in 2011.
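To make the mechanics concrete, here is a noise-free, single-variable toy sketch of this reduction step in plain Python. The pair (ka, kb) with kb + ka·s = s² is a hypothetical stand-in for a published multiple of x − s with leading term x²; in the real scheme it would carry noise:

```python
q = 97
s = 5
# a quadratic "ciphertext polynomial" c2*x^2 + c1*x + c0
c2, c1, c0 = 11, 3, 42
quad = (c2*s**2 + c1*s + c0) % q
# hypothetical noise-free relinearisation key: kb + ka*s = s^2 (mod q)
ka = 29
kb = (s**2 - ka*s) % q
# eliminate the x^2 term by substituting c2*x^2 -> c2*(kb + ka*x)
d1 = (c1 + c2*ka) % q
d0 = (c0 + c2*kb) % q
lin = (d1*s + d0) % q
assert quad == lin  # the linear polynomial evaluates to the same value
```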
Ciphertexts in the GSW13 scheme are N × N matrices C over Z_q with entries in {0, 1}. The secret key is a vector v of dimension N over Z_q with at least one big coefficient v_i. Let's restrict messages to μ ∈ {0, 1}. We say that C encrypts μ when C·v = μ·v + e for some small e. To decrypt we simply compute C·v and check if there are large coefficients in the result.
Ciphertexts in fully homomorphic encryption schemes are meant to be added and multiplied. Starting with addition, consider C1 + C2. We have (C1 + C2)·v = C1·v + C2·v = (μ1 + μ2)·v + (e1 + e2).
Moving on to multiplication, we first observe that if matrices C1 and C2 have a common exact eigenvector v with eigenvalues μ1 and μ2, then their product has eigenvector v with eigenvalue μ1·μ2. But what about the noise?
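Before adding noise back in, a quick numeric sanity check of the exact-eigenvector claim; the matrices are arbitrary examples chosen to share the eigenvector (1, 0):

```python
# C1 and C2 share the eigenvector v = (1, 0) with eigenvalues 2 and 5;
# their product has the same eigenvector with eigenvalue 2*5 = 10.
def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

C1 = [[2, 1], [0, 3]]
C2 = [[5, 4], [0, 7]]
v = [1, 0]
assert mat_vec(C1, v) == [2, 0]                 # eigenvalue 2
assert mat_vec(C2, v) == [5, 0]                 # eigenvalue 5
assert mat_vec(mat_mul(C1, C2), v) == [10, 0]   # eigenvalue 2*5
```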
Considering C1·C2·v:

C1·C2·v = C1·(μ2·v + e2) = μ2·(μ1·v + e1) + C1·e2 = μ1·μ2·v + μ2·e1 + C1·e2.

In the above expression μ2, e1, e2 and (for small-entry C1) C1·e2 are all small, μ1·μ2·v is not. In other words, the signal is not drowned by the noise and we have achieved multiplication. How far can we go with this, though?
Assume μ2, e1 and e2 are all bounded by B in absolute value. After multiplication the noise is bounded by roughly B². Hence, if we have L levels of multiplications, the noise grows worse than B^L. We require the noise to stay well below q, otherwise the wrap-around will kill us. Also, q must be sub-exponential in N for security reasons. The bigger this gap, the easier the lattice problem we are basing security on, and if the gap is exponential then there is a polynomial-time algorithm for solving this lattice problem: the famous LLL algorithm. As a consequence, the scheme as-is allows to evaluate polynomials of degree sublinear in N.
To improve on this, we need a new notion. We call C strongly bounded if the entries of C are bounded by 1 and the error e is bounded by B.
In what follows, we will only consider a `NAND` gate: NAND(μ1, μ2) = 1 − μ1·μ2. `NAND` is a universal gate, so we can build any circuit with it. However, in this context its main appeal is that it ensures that the messages stay in {0, 1}. Note the μ2·e1 term above, which is big if μ2 is big. If C1 and C2 are strongly bounded then the error vector of C1·C2 is bounded by (N + 1)·B instead of B². Now, if we could make the entries of C1·C2 bounded by 1 again, it itself would be strongly bounded and we could repeat the above argument. In other words, this would enable circuits of depth L with noise roughly (N + 1)^L·B, i.e. polynomial depth circuits (instead of merely polynomial degree) when q is sub-exponential as above.
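The claim that NAND keeps messages in {0, 1} is immediate, but here is the check, using the standard arithmetisation NAND(μ1, μ2) = 1 − μ1·μ2:

```python
# 1 - m1*m2 agrees with logical NAND on bits and stays in {0, 1}
for m1 in (0, 1):
    for m2 in (0, 1):
        out = 1 - m1 * m2
        assert out == int(not (m1 and m2))
        assert out in (0, 1)
```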
We'll make use of an operation `BitDecomp` which splits a vector of integers into a longer vector of the bits of the integers. For example, (2, 3) over Z_8 becomes (0, 1, 0, 1, 1, 0) (low-order bits first), which has length 2·3. Here's a simple implementation in Sage:
def bit_decomp(v):
    R = v.base_ring()
    k = ZZ((R.order()-1).nbits())
    w = vector(R, len(v)*k)
    for i in range(len(v)):
        for j in range(k):
            if 2**j & ZZ(v[i]):
                w[k*i+j] = 1
            else:
                w[k*i+j] = 0
    return w
We also need a function which reverses the process, i.e. adds up the appropriate powers of two. It is called BitDecomp⁻¹ in the GSW13 paper, but I'll simply call it … `BitComp`.
def bit_comp(v):
    R = v.base_ring()
    k = ZZ((R.order()-1).nbits())
    assert(k.divides(len(v)))
    w = vector(R, len(v)//k)
    for i in range(len(v)//k):
        for j in range(k):
            w[i] += 2**j * ZZ(v[k*i+j])
    return w
Actually, the definition of `BitComp` is a bit more general than just adding up bits. As defined above (following the GSW13 paper) it will add up any integer entry of the input with the appropriate power of two multiplied in. This is relevant for the next function we define, namely `Flatten`, which we define as `BitDecomp(BitComp(⋅))`.
flatten = lambda v: bit_decomp(bit_comp(v))
Finally we also define `PowersOf2` which produces a new vector from v as (2⁰·v_0, 2¹·v_0, …, 2^(k−1)·v_0, 2⁰·v_1, …):
def powers_of_two(v):
    R = v.base_ring()
    k = ZZ((R.order()-1).nbits())
    w = vector(R, len(v)*k)
    for i in range(len(v)):
        for j in range(k):
            w[k*i+j] = 2**j * v[i]
    return w
For example, the output of `PowersOf2` on (1, 2) for q = 8 is (1, 2, 4, 2, 4, 0). Having defined these functions, let's look at some identities. It holds that

⟨BitDecomp(v), PowersOf2(w)⟩ = ⟨v, w⟩,

which can be verified by recalling integer multiplication: the bits of v_i, each multiplied by the matching power of two times w_i, add up to v_i·w_i. For example, 5·3 = (1 + 4)·3 = 1·3 + 4·3.
Or in the form of some Sage code:
q = 8
R = IntegerModRing(q)
v = random_vector(R, 10)
w = random_vector(R, 10)
v.dot_product(w) == bit_decomp(v).dot_product(powers_of_two(w))
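The same check works without Sage; the helpers below are simplified plain-Python stand-ins for the functions above:

```python
q, k = 8, 3  # k = number of bits of q - 1

def bit_decomp(v):
    # low-order bits first, matching the Sage implementation above
    return [(x >> j) & 1 for x in v for j in range(k)]

def powers_of_two(w):
    return [(2**j * x) % q for x in w for j in range(k)]

v, w = [2, 3], [5, 6]
lhs = sum(a * b for a, b in zip(bit_decomp(v), powers_of_two(w))) % q
rhs = sum(a * b for a, b in zip(v, w)) % q
assert lhs == rhs
```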
Furthermore, let a be an n·k dimensional vector and b be an n dimensional vector. Then it holds that

⟨BitComp(a), b⟩ = ⟨a, PowersOf2(b)⟩

because both sides evaluate to Σ_i Σ_j 2^j·a_{k·i+j}·b_i.
Finally, we have

⟨Flatten(a), PowersOf2(b)⟩ = ⟨a, PowersOf2(b)⟩

by combining the previous two statements.
For example, let q = 8, a = (2, 3, 0) and b = (3). `BitComp` on a gives (0), since 2 + 2·3 = 8 ≡ 0 mod 8; `BitDecomp` on (0) gives (0, 0, 0), so both inner products evaluate to 0. The same example in Sage:
q = 8
R = IntegerModRing(q)
a = vector(R, (2,3,0))
b = vector(R, (3,))
bit_comp(a).dot_product(b) == flatten(a).dot_product(powers_of_two(b))
Hence, running `Flatten` on a ciphertext produces a strongly bounded matrix which behaves identically on v = PowersOf2(s). Boom.^{1}
It remains to sample a key and to argue why this construction is secure if LWE is secure for the choice of parameters.
To generate a public key, pick LWE parameters n and q and sample an LWE instance as usual.
The secret key is v = PowersOf2(s) of dimension N = (n + 1)·k, where s = (1, −s′) for an LWE secret s′. The public key is A, where we roll b into A, which now has dimension m × (n + 1); by construction A·s = e.
To encrypt, sample an N × m matrix R with {0, 1} entries and compute R·A, which is an N × (n + 1) matrix. That is, we do exactly what we would do with Regev's original public-key scheme based on LWE: taking random linear combinations of the rows of A to make fresh samples. Run `BitDecomp` on the output to get an N × N matrix. Finally, set

C = Flatten(μ·I_N + BitDecomp(R·A)).
For correctness, observe:

C·v = μ·v + BitDecomp(R·A)·v = μ·v + R·A·s = μ·v + R·e.
The security argument is surprisingly simple. Consider BitComp(C) = μ·BitComp(I_N) + R·A. Because C is already the output of `Flatten`, it reveals nothing more than BitComp(C).
Unpacking BitComp(C) to μ·BitComp(I_N) + R·A, note that R·A is statistically uniform by the leftover hash lemma for uniform A. Finally, A is indistinguishable from a uniform matrix by the decisional LWE assumption. Boom.
So far, this scheme is not more efficient than previous ring-based schemes such as BGV, even asymptotically. However, an observation by Zvika Brakerski and Vinod Vaikuntanathan in Lattice-Based FHE as Secure as PKE changed this. The observation is that the order of multiplications matters. Let's multiply four ciphertexts C1, C2, C3, C4.
The traditional approach would be the balanced product (C1·C2)·(C3·C4). In this approach the noise grows as follows: each pairwise product has noise roughly N·B, and multiplying the two products multiplies that noise by roughly N again. Note the multiplicative N factor per level. That is, for k multiplications we get a noise of size roughly N^{log k}·B.
In contrast, consider this sequential approach: ((C1·C2)·C3)·C4, where at each step the accumulated, flattened product is multiplied by one fresh ciphertext. Under this order, the noise grows additively, by N·B per multiplication. Note that each fresh error e_i is multiplied by small messages and one strongly bounded matrix only. Hence, here the noise grows linearly in the number of multiplications, i.e. as k·N·B.
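The two growth patterns can be sketched as back-of-envelope recurrences; these are hypothetical simplified bounds capturing the argument above, not formulas from the paper:

```python
# Balanced product tree: noise is multiplied by ~(N+1) per level.
# Sequential order: one fresh ciphertext per step adds N*B noise.
N, B = 1024, 1

def balanced(k):
    # k a power of two; noise bound of a balanced product of k ciphertexts
    if k == 1:
        return B
    e = balanced(k // 2)
    return e + N * e       # one half's noise plus matrix times the other's

def sequential(k):
    e = B
    for _ in range(k - 1):
        e = e + N * B      # accumulated product times one fresh ciphertext
    return e

assert balanced(4) == (N + 1)**2 * B        # ~ N^log(k): quasi-polynomial
assert sequential(4) == B + 3 * N * B       # linear in k
```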
The idea of fplll days is inspired by and might follow the format of Sage Days which are semi-regularly organised by the SageMath community. The idea is simply to get a bunch of motivated developers in a room to work on code. Judging from experience in the SageMath community, lots of interesting projects get started and completed.
We intend to combine the coding sprint with the lattice meeting (to be confirmed), so we’d be looking at 3 days of coding plus 2 days of regular lattice meeting. We might organise one talk per coding day, to give people a reason to gather at a given time of the day, but the focus would be very much on working on fplll together.
If you’d like to attend, please send an e-mail to one of the maintainers e.g. me.
For example, the following Cython code cannot be interrupted:
while True: pass
On the other hand, if you have written Cython code in Sage, then you might have come across its nifty `sig_on()`, `sig_off()` and `sig_check()` macros which prevent crashes and allow your calls to C/C++ code to be interrupted. Sage has had signal handling (crashes, interrupts) forever (see below).
Cysignals is Sage's signal handling reborn as a stand-alone module, i.e. it allows wrapping C/C++ code blocks in `sig_on()` and `sig_off()` pairs which catch signals such as `SIGSEGV`. Using it is straight-forward. Simply add

include "cysignals/signals.pxi"

to each Cython module and then wrap long-running computations in `sig_on()` / `sig_off()` pairs or check for signals with `sig_check()`. See the cysignals documentation for details.
We have a first pre-release out. Pre-release because we haven't switched Sage to the new, stand-alone code yet. Once this is done, we'll publish version 1.0, since some version of this code has been in use on many different systems for at least a decade.
Sage’s signal handling is rather old. It was definitely already there when I joined the project in 2006. The first version was written by William Stein (like many things Sage). The oldest record I can find is that in November 2006 I improved things a bit and sent William the following e-mail bragging about my results:
William,
my signal handler makeover seems to work, but there is one thing to clean up before I can submit it. In the new implementation a unique state struct is required, called _signals. As right now interrupt.o is linked statically into every extension module this cannot be accomplished. Obviously a shared library is the solution here and I’m facing the following decision: Shall I create a new spkg with a shared library called libcsage (or something else) which gets linked dynamically to every extension module or shall I build an extension module “interrupt” containing only C code and use/link that? Does distutils support that? I’ll go for the separate shared library if you don’t have an opinion on this because I assume that such a very thin (i.e. 3 functions or so) library could be handy also for other purposes like a malloc replacement etc.. I guess this “library” would be BSD-licensed (?) because of the signal handler stuff.
Also I was wondering if you could recommend any test cases for the signal handlers. I have tested: Catching a SIGSEGV in NTL works, Ctrl-C in echelon form computation modint works, KeyboardInterrupts in the Interpreter work. But I’m not sure what else to test.
Martin
PS:
SAGE 1.4:
sage: time for i in range(100000): _ = ntl.GF2E_random()
CPU times: user 1.33 s, sys: 0.78 s, total: 2.12 s
Wall time: 2.12
My New Signal Handler:
sage: time for i in range(100000): _ = ntl.GF2E_random()
CPU times: user 0.38 s, sys: 0.06 s, total: 0.43 s
Wall time: 0.43
No Signal Handler:
sage: time for i in range(100000): _ = ntl.GF2E_random()
CPU times: user 0.31 s, sys: 0.00 s, total: 0.32 s
Wall time: 0.32
For your amusement, here is how long that computation takes today:
sage: K.<a> = GF(2^8)
sage: C = ntl.GF2EContext(K)
sage: time for i in range(100000): _ = ntl.GF2E_random(C)
CPU times: user 44 ms, sys: 12 ms, total: 56 ms
Wall time: 39 ms
In 2010, Jeroen Demeyer came along, put a serious amount of work into Sage’s signal handling and made it so much faster, more robust and overall better. In 2012, Volker Braun added more robust backtracks.
This code now becoming independent of Sage follows a similar trajectory. It started with a post of William on [sage-support]:
As an example, I’ve suggested numerous times that we should massively refactor the sage library to be a bunch of smaller Python libraries, develop them say on github (?), use Pypi and pip. If people would realize how important it is that we revamp how Sage development is done in a much less monolithic way, and better using existing tools, then I would be happy and enjoy watching as people do the revamp (e.g., like happened with switching from Mercurial to Git, which I didn’t do much on, but definitely supported). As it is, I sadly see that nobody seems to “get it” regarding how broken our development process is right now. So, I figure I’ll have to do something. Unfortunately, I can’t do anything now, due to lack of time. I hope that either I’m totally wrong, or that I do have time before Sage gets into deeper trouble.
In the meantime, I had copied Sage's signal handling for fpylll. Because copying code is evil, and motivated by William's email, I decided to push for making Sage's signal handling independent of Sage and started working on it. Once I had a first prototype, Jeroen put in a serious amount of work and made everything better again.
We have just finalised the date and location for the First CoDiMa Training School in Computational Discrete Mathematics, which will take place at the University of Manchester on November 16th-20th, 2015. This school is intended for post-graduate students and researchers from UK institutions. It will start with a 2-day hands-on Software Carpentry workshop covering basic concepts and tools, including working with the command line, version control and task automation, continue with introductions to the GAP and SageMath systems, and conclude with a series of lectures and exercise classes on a selection of topics in computational discrete mathematics.
The School's website and the registration details will be announced shortly. In the meantime, if you're interested in attending, please keep the dates in your diary and check for updates on our website and on our Twitter @codima_project, or tell us about your interest by writing to contact at codima.ac.uk so that we can notify you when registration begins.
The core of PolyBoRi is a C++ library, which provides high-level data types for Boolean polynomials and monomials, exponent vectors, as well as for the underlying polynomial rings and subsets of the powerset of the Boolean variables. As a unique approach, binary decision diagrams are used as internal storage type for polynomial structures.
On top of this C++ library we provide a Python interface. This allows parsing of complex polynomial systems, as well as sophisticated and extendable strategies for Gröbner basis computation. PolyBoRi features a powerful reference implementation for Gröbner basis computation.
Boolean polynomials show up a lot in cryptography and other areas of computer science.
The trouble with PolyBoRi is that both authors of PolyBoRi – Alexander Dreyer and Michael Brickenstein – left academia and now have jobs which have nothing to do with PolyBoRi. Hence, PolyBoRi is currently not maintained. This is a big problem. In particular, there are some issues with PolyBoRi which cannot be ignored forever:
In the long-term the Singular team might get involved and keep PolyBoRi alive, but this is not certain. Also, there is a push for a decision about what to do with PolyBoRi in Sage now.
The current proposal on the table is to drop PolyBoRi from the default Sage installation, i.e. to demote it to an optional package. In my mind, this would be very bad as Sage and PolyBoRi benefit from the tight integration that currently exists. Also, in my experience, optional packages tend to simply not work that well as they are not tested in each release.
Hence, if you care about PolyBoRi you should consider getting involved.
I’m up for getting involved, but I don’t want to take on the responsibility alone.
Update (2015-06-13): A fair share of work has already been done by Andrew. Still, anyone up for helping out?
Mission To support the maintenance and development of the OpenDreamKit components, and in particular of the SageMath project, by enhancing their software infrastructure, in close collaboration with the community.
Activities According to his or her specific skills, the developer will take on, in close collaboration with the community, a selection of the software engineering tasks defined in the OpenDreamKit project. Those tasks concern:
- Improvements to the build, packaging, and distribution systems;
- Improvements to the portability of many of the components, in particular on the Windows+Cygwin platform;
- Refactoring of the static Sphinx-based documentation system, and design and development of dynamic documentation tools;
- Participation in the design and implementation of a component architecture for the SageMath ecosystem, as a foundation for Virtual Research Environments such as SageMathCloud;
- Active participation in regular international development and training meetings with the other OpenDreamKit participants and the community at large.
[…]
Skills and background requirements
- Fluency in C, C++, Python, …
- Strong experience with development and system administration in Unix-like environments;
- Experience with Windows development (Cygwin) desirable;
- Strong experience with software build systems (e.g. scons, automake);
- Strong experience in open-source development (collaborative development tools, interaction with the community, …);
- Experience with code optimization, parallelism, etc, appreciated but not a prerequisite.
- Experience with computational mathematics software, in particular SageMath, desirable but not a prerequisite;
- Mathematics background appreciated but not a prerequisite;
- Strong communication skills;
- Perfect fluency in English;
- Speaking French is appreciated but not a prerequisite.
Context The position will be funded by OpenDreamKit, a Horizon 2020 European Research Infrastructure project that will run for four years, starting from September 2015. This project brings together the open source computational mathematics ecosystem – in particular LinBox, MPIR, SageMath, GAP, Pari/GP, LMFDB, Singular, MathHub, and the IPython/Jupyter interactive computing environment – toward building a flexible toolkit for Virtual Research Environments for mathematics. Led by Université Paris Sud, this project involves about 50 people spread over 15 sites in Europe, with a total budget of about 7.6 million euros.
Within this ecosystem, the developer will work primarily on the free open-source mathematics software system SageMath. Based on the Python language and many existing open-source math libraries, SageMath has been developed over the past 10 years by a worldwide community of 300 researchers, teachers and engineers, and has reached 1.5M lines of code.
The developer will work within one of the largest teams of SageMath developers, composed essentially of researchers in mathematics and computer science, at the Laboratoire de Recherche en Informatique and in nearby institutions.
Comments This is not a postdoc position. While side research will be welcome, and a few tasks may possibly lead to some research problems in computer science, the core tasks will be pure development. Candidates wishing to pursue an academic research career in the long run should think twice about whether this opportunity is right for them.
For GGHLite the Ext-GCDH problem is defined as computing the higher order bits of the extraction of a level-κ encoding of the product of the encoded elements, given the public parameters y, x_0, x_1 and p_zt together with the level-1 encodings.
All elements involving x live in R = Z[x]/(x^n + 1), where n is a power of two; reductions mod q happen in R_q = R/qR. All elements except for z and h are small: z is random in R_q and h has size about √q. We define "levels" by the powers of z, e.g. u = c/z is a level-1 encoding of e for some small c with c ≡ e mod ⟨g⟩.
In a non-interactive key exchange (NIKE) based on GGH-like maps, the higher order bits of the extracted level-κ value are essentially the shared secret. Let's look at a simple Sage implementation of non-interactive key exchange based on a GGH-like multilinear map.
class NIKESloppy:
    """
    A sloppy implementation of NIKE for testing.

    By "sloppy" we mean that we do not care about distributions,
    verifying some properties or exact sizes.
    """
    def __init__(self, n, kappa):
        """
        Initialise a new sloppy GGHLite instance.

        :param n: dimension (must be power of two)
        :param kappa: multilinearity level `κ`
        """
        self.n = ZZ(n)
        if not n.is_power_of(2):
            raise ValueError
        self.kappa = ZZ(kappa)
        self.R = ZZ["x"]
        self.K = CyclotomicField(2*n, "x")
        self.f = self.R.gen()**self.n + 1
        self.sigma = self.n
        self.sigma_p = round(self.n**(ZZ(5)/2) * self.sigma)
        self.sigma_s = round(self.n**(ZZ(3)/2) * self.sigma_p**2)
        self.q = next_prime((3*self.n**(ZZ(3)/2) * self.sigma_s*self.sigma_p)**(3*self.kappa))
        self.Rq = self.R.change_ring(IntegerModRing(self.q))

        g = self.discrete_gaussian(self.sigma)
        # we enforce primality to ensure we can invert mod ⟨g⟩
        while not self.K(g).is_prime():
            g = self.discrete_gaussian(self.sigma)

        h = self.discrete_gaussian(sqrt(self.q))
        z = self.Rq.random_element(degree=n-1)

        # encoding of one
        a = self.discrete_gaussian(self.sigma_p/self.sigma)
        self.y = ((1+a*g) * z.inverse_mod(self.f)) % self.f

        # encoding of zero
        b0 = self.discrete_gaussian(self.sigma_p/self.sigma)
        self.x0 = ((b0*g) * z.inverse_mod(self.f)) % self.f

        # encoding of zero
        b1 = self.discrete_gaussian(self.sigma_p/self.sigma)
        self.x1 = ((b1*g) * z.inverse_mod(self.f)) % self.f

        # zero testing parameter
        self.pzt = h*z**kappa * self.Rq(g).inverse_mod(self.f)

        # we might as well give these to the attacker, as they can be
        # computed easily
        self.G = self.canonical_basis(g)
        self.I = self.K.ideal([self.R(row.list()) for row in self.G.rows()])
Logarithms of Euclidean norms are useful as “small” and “not-so-small” are what distinguishes between secret and not-so-secret.
    @staticmethod
    def norm(f):
        "Return log of Euclidean norm of `f`"
        return log(max(map(abs, f)), 2.).n()
It is easy to compute a canonical basis for the ideal ⟨g⟩ from publicly available information. In this simplified version we simply give this canonical basis to the attacker.
    def canonical_basis(self, g):
        "Return HNF of g"
        x = self.R.gen()
        n = self.n
        G = [x**n + ((x**i * g) % self.f) for i in range(n)]
        G = [r.list()[:-1] for r in G]
        G = matrix(ZZ, n, n, G)
        return G.hermite_form()
It will be useful below to compute short remainders modulo some number ring element. We use Babai’s rounding trick for that.
    def rem(self, h, g):
        """
        Compute a small remainder of `h` modulo `g`

        :param h: a polynomial in ZZ[x]/x^n+1
        :param g: a polynomial in ZZ[x]/x^n+1
        :returns: small representative of `h mod g`
        """
        g_inv = g.change_ring(QQ).inverse_mod(self.f)
        q = (h * g_inv) % self.f
        q = self.R([round(c) for c in q.coefficients()])
        r = (h - q*g) % self.f
        return r
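In one dimension, Babai's rounding trick is just rounding division; here is a plain-Python analogue of the step above, with arbitrary toy values:

```python
# One-dimensional analogue of Babai rounding: subtract the nearest
# multiple of g, leaving a remainder of size at most |g|/2.
def small_rem(h, g):
    return h - round(h / g) * g

for h in range(-20, 21):
    assert abs(small_rem(h, 7)) <= 7 / 2
```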
We do not care about distributions of elements in our toy example, so we simply return random polynomials where the coefficients are bounded by sigma.
    def discrete_gaussian(self, sigma):
        """
        Sample an element with coefficients bounded by `sigma`.

        .. note:: this is obviously *not* a discrete Gaussian, but we don't
           care here.
        """
        sigma = round(sigma)
        return self.R.random_element(x=-sigma, y=sigma, degree=self.n-1)
We need a function to sample new encodings:
    def sample(self):
        """
        Sample a level-0 and a level-1 encoding.
        """
        e0 = self.discrete_gaussian(self.sigma_p)
        r0 = self.discrete_gaussian(self.sigma_s)
        r1 = self.discrete_gaussian(self.sigma_s)
        u0 = (e0*self.y + r0*self.x0 + r1*self.x1) % self.f
        return e0, u0
For the final step we need to extract a representation over the integers from our representation modulo q. Sage does not normalise representatives to (−q/2, q/2) but to [0, q). We work around this behaviour. Also, lower-order bits are noisy, so we discard them.
    def extract(self, p, mul_pzt=True, drop_noise=True):
        """
        Convert a mod-q element to an element over the integers.

        :param p: a polynomial in ZZ[x]/⟨x^n+1⟩
        :param mul_pzt: multiply by pzt
        :param drop_noise: remove lower-order terms
        """
        p = p % self.f
        if mul_pzt:
            p = (self.pzt*p) % self.f
        q = parent(p).base_ring().order()
        f = []
        for c in p.change_ring(ZZ).list():
            if c < q/2:
                f.append(c)
            else:
                f.append(c-q)
        if drop_noise:
            f = [f[i] >> floor(log(self.q//2, 2)) for i in range(self.n)]
        return ZZ['x'](f)
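The two normalisation steps are independent of Sage, so here is a pure-Python sketch with a made-up modulus q (the values are illustrative, not scheme parameters):

```python
from math import floor, log2

def centred(c, q):
    """Map a representative in [0, q) to the centred range around 0."""
    return c if c < q / 2 else c - q

def msb(c, q):
    """Keep only the high-order bits, as drop_noise=True does."""
    shift = floor(log2(q // 2))
    return centred(c, q) >> shift

q = 65537
assert centred(65534, q) == -3        # 65534 ≡ -3 (mod 65537)
assert msb(65534, q) == -1            # a "small negative" value
assert msb(3, q) == 0                 # a "small positive" value
```

Two values that differ only by low-order noise map to the same `msb` output, which is exactly what makes extraction well-defined on encodings.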
Finally, we implement NIKE: we multiply κ level-1 encodings and our secret level-0 encoding to produce a level-κ encoding of the product of all secrets, from which we then extract our shared key.
    def __call__(self, e0, U):
        """
        Run NIKE with `e_0` and `U`.

        :param e0: a level-0 encoding
        :param U: κ level-1 encodings
        """
        if len(U) != self.kappa:
            raise ValueError("expected κ level-1 encodings")
        t = prod(U) % self.f
        t = t * e0 % self.f
        t = self.extract(t)
        return tuple(t.list())
Let me stress once more that the above is not a proper implementation of a GGH-like candidate multilinear map. For starters, the method called discrete_gaussian does not return elements following a discrete Gaussian distribution. We also skipped several tests from the GGH and GGHLite specifications. However, this simplified variant is sufficient to explain the attack of Hu and Jia.
The attack proceeds in two steps. The first step was already known as the “weak discrete log attack”. (Update: Hu and Jia do not present the first step of their attack this way. Instead, they give another way of computing a “kind of” level-0 encoding from a, say, level-1 encoding. Damien Stehlé was the first to point out to me that this step can be replaced by the “normal” weak discrete log attack.) Say we have some level-1 encoding u. We can compute:

v_0 = pzt ⋅ u ⋅ x_0 ⋅ y^(κ-2) = h ⋅ b_0 ⋅ e ⋅ (1+ag)^(κ-2) mod q
for some small e with u = e/z mod q. Note that the right hand side has no modular reduction modulo q. Also note that v_0 ≡ h ⋅ b_0 ⋅ e mod ⟨g⟩, i.e. we have a representative of h ⋅ b_0 ⋅ e modulo ⟨g⟩.
Similarly, we can compute:

v_n = pzt ⋅ y^(κ-1) ⋅ x_0 = h ⋅ b_0 ⋅ (1+ag)^(κ-1) mod q.
Now, assume h ⋅ b_0 is invertible modulo ⟨g⟩. Then we can compute v_0 ⋅ v_n^(-1) mod ⟨g⟩, and since (1+ag) ≡ 1 mod ⟨g⟩ this is a representative of e modulo ⟨g⟩. In other words, we can compute a representative of e, “kind of” at level-0. Here’s some Sage code implementing this step:
def weak_dlog(params, u):
    """
    Weak discrete log attack from [EC:GarGenHal13]_

    :param params: GGH parameters
    :param u: a level-1 encoding
    """
    kappa = params.kappa
    v0 = params.extract(u*params.x0*params.y**(kappa-2), drop_noise=False)
    vn = params.extract(params.y**(kappa-1)*params.x0, drop_noise=False)
    v0 = params.K(v0).mod(params.I)
    vn = params.K(vn).mod(params.I)
    r = (v0 * vn.inverse_mod(params.I)).mod(params.I)
    return r.polynomial().change_ring(ZZ)
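As a sanity check on the algebra (not the attack itself), the quotient trick can be replayed with toy integers, where a small prime g stands in for the ideal ⟨g⟩ and all other values are made up: the quotient v_0/v_n reveals e mod g even though h and b_0 stay hidden.

```python
g = 101                    # a prime standing in for ⟨g⟩
h, b0, e = 9973, 57, 42    # hidden values; only v0 and vn are "published"
v0 = (h * b0 * e) % g      # analogue of h·b0·e mod ⟨g⟩
vn = (h * b0) % g          # analogue of h·b0 mod ⟨g⟩
r = (v0 * pow(vn, -1, g)) % g
assert r == e % g          # e recovered modulo g
```

The assumption that h ⋅ b_0 is invertible modulo ⟨g⟩ corresponds to `vn` being non-zero mod g here.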
Now, compute representatives v_i ≡ e_i mod ⟨g⟩ for all u_i and finally η = ∏ v_i mod f. This produces a “kind of” level-0 encoding of ∏ e_i, where “kind of” stands for “not small”. If η was small, we could simply compute extract(η ⋅ y^κ) to solve the Ext-GCDH problem. However, η is not small.
This is where Hu and Jia come in with a clever and remarkably simple step two. They define:

Y = extract(y^(κ-1) ⋅ x_0) = h ⋅ b_0 ⋅ (1+ag)^(κ-1) mod q
X = extract(y^(κ-2) ⋅ x_0^2) = h ⋅ b_0^2 ⋅ g ⋅ (1+ag)^(κ-2) mod q
Again, note that the right hand sides have no modular reductions modulo q because they are “somewhat short”.
Recall that η = ∏ e_i + ξ ⋅ g for some ξ and compute:

η'' = (η ⋅ Y) mod X,   η''' = η'' ⋅ y/x_0 ≡ (∏ e_i) ⋅ h ⋅ (1+ag)^κ ⋅ g^(-1) + ξ ⋅ h ⋅ (1+ag)^κ mod q
where the second summand is small. In other words, we have computed

η''' ≈ pzt ⋅ (∏ e_i) ⋅ (1+ag)^κ ⋅ z^(-κ) mod q,

the zero-tested value of a level-κ encoding of ∏ e_i,
i.e. we have solved the Ext-GCDH problem.
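The size argument behind this step can also be checked with toy integers (all values invented, q a convenient prime): adding a small multiple of h does not change the high-order bits, so any representative of ∏ e_i mod g leads to the same extracted key.

```python
q = 2**31 - 1                 # a prime standing in for the modulus
g, h, e, xi = 7, 11, 5, 3     # toy stand-ins for the scheme's values
eta = e + xi * g              # another representative of e mod g
g_inv = pow(g, -1, q)
v = (e * h * g_inv) % q       # zero-tested value for e itself
w = (eta * h * g_inv) % q     # attacker's value; equals v + ξ·h mod q
assert w == (v + xi * h) % q
assert (w >> 10) == (v >> 10)  # high-order bits, i.e. the key, agree
```

In the real scheme ξ ⋅ h is exponentially smaller than q, so discarding the lower-order bits removes the discrepancy except with small probability at carry boundaries.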
Here’s some Sage code implementing these three steps:
def ggh_break(params, U):
    """
    Attack from [EPRINT:HuJia15]_

    :param params: GGH parameters
    :param U: κ+1 level-1 encodings
    """
    if len(U) != params.kappa + 1:
        raise ValueError("expected κ+1 level-1 encodings")
    kappa = params.kappa

    # large "level-0" encodings of U
    V = [weak_dlog(params, u) for u in U]
    # large "level-0" encoding of prod(U)
    eta = prod(V) % params.f

    # Y = h ⋅ b_0 ⋅ (1+ag)^(κ-1)
    Y = params.extract(params.y**(kappa-1) * params.x0, drop_noise=False)
    # X = h ⋅ b_0² ⋅ g ⋅ (1+ag)^(κ-2)
    X = params.extract(params.y**(kappa-2) * params.x0**2, drop_noise=False)

    # η'' = (η ⋅ Y) mod X, using Babai rounding for a small remainder
    eta_ = (eta * Y) % params.f
    eta__ = params.rem(eta_, X)

    # η''' = η'' ⋅ y/x_0
    xy = params.y * params.x0.inverse_mod(params.f) % params.f
    eta___ = params.extract(xy * eta__, mul_pzt=False)
    return tuple(eta___.list())
Finally, let’s put everything together and run the attack. If all goes well, the following code should print the same tuple three times.
def run(n, kappa):
    params = NIKESloppy(n, kappa)
    EU = [params.sample() for i in range(kappa+1)]
    E = [eu[0] for eu in EU]
    U = [eu[1] for eu in EU]
    # first make sure NIKE is correct, by running it twice
    print params(E[0], U[1:])
    print params(E[-1], U[:-1])
    # now run the attack
    print ggh_break(params, U)