Finding numbers to use for p, g, and x.

Aaron Logue, Sep 2013

Here is a small collection of utilities with C source that implements constructed (using Maurer's algorithm with all factors of p-1 prime) and probabilistic (randomly generated and validated with the Rabin-Miller algorithm) methods of generating primes, a primitive root finder for guaranteeing 1 to p-1 generators, a prime number tester, a sequential probabilistic prime number generator for primes large or small, and a tool to help manage primes and proofs when using multiple computers to find suitable primes. Disclaimer! - I am a computer programmer, not a mathematician. Exercise all appropriate caution.
pgen.tar.gz - Linux     pgen.zip - Windows
A test program is included. If it reports errors, be sure to set int_type and
int2x_type in bigint.h to 32-bit and 64-bit types for your machine's architecture.
See the page on the bigint library for more info about that if needed.

The command-line utilities and how to use them: pgenp -b 120 -n 50 -o 3
  • pgen - Construct p = 2qr + 1 primes
  • proots - Find primitive roots of p = 2qr + 1 primes
  • pmerge - Merge lists of pregenerated r values
  • pgenp - Generate random primes
  • pseq - Generate sequential primes
  • ptest - Test a number for primality

The Diffie-Hellman algorithm allows two computers to swap two numbers
and then use them to independently compute a shared secret.  The secret
remains secret, even if the numbers swapped were not.

g^x mod p is the formula at the heart of this mathematical sleight (handled
by the bigint_pow_mod(g,x,p,result) function in bigint.c).  If we call the
two computers that want to compute a shared secret Alice and Bob, then the
algorithm works like this:  First, Alice and Bob agree on values to use
for g and p.  g is a small number, and p is a large prime that should have
at least one large prime factor of p-1.  Alice makes up a secret random
number x such that g^x is a great many times larger than p, and Bob makes
up a secret number y.  Alice computes X = g^x mod p and Bob computes
Y = g^y mod p.  They swap X and Y, and then Alice computes S = Y^x mod p
while Bob computes S = X^y mod p.  Alice and Bob arrive at the same secret
value S because exponentiation and modulo division are commutative.  That is,
(g^x mod p)^y mod p = (g^y mod p)^x mod p.

If an eavesdropper could figure out what random numbers Alice and Bob used
for x and y, they would have S.  However, the mod function throws the original
number away and just gives the remainder after a division, so even if an
eavesdropper sees the result of g^x mod p and knows what g and p are, they
will have to test a great many values of x before finding one that yielded
the result, provided that p is large.

There is an attack called Pohlig-Hellman against values of p that happen to
have small factors for p-1.  pgen produces three-factor values of p-1 with
one of them being 2 and two of them being large primes.  It should be
difficult to find the values of the q and r primes used to construct the
prime p = 2qr + 1.  Thus, the primes generated by pgen should defend against
the Pohlig-Hellman attack.

pgen stores the factors of p-1 in a proofs file which is used by the proots
program to compute primitive roots of p for the generator g.  When g is
a primitive root of p, g^x mod p yields a cyclic group of the maximum size,
which is all values from 1 to p-1.

--------------------------------------------------------------------

Here is how I used these tools to get to a working test key exchange:

I began with a few runs of "pgen -e" on the Windows boxes to scramble
their random number generators, then ran "pgen -b 1023 -m -n 200"
on three different machines and let them run overnight.  This yielded
600 1023-bit proven prime numbers constructed using Maurer's algorithm.
I combined all of them in one file using the pmerge program, then
distributed the combined list of primes to each of the machines and ran
"pgen -b 2048 -z" on all of them.

In a little over an hour I had several proven 2048-bit primes with large
prime factors of p-1 written to the primes.proof file to choose from.
I picked one (it happened to begin with 28939...) and computed a short
list of 10 maximum-period generators with "proots -r 28939 -n 10".
proots scanned the primes.proof file for a prime beginning with "28939",
verified that r of p-1 was prime, and computed the first 10 generators mod p.
I picked one of the generators to use for g: 5.  Finally, I ran
"pmerge primes -d 28939 -o 16 -s 64" to display the prime in hex and
tested it out with this program:

/*****************************************************************************
 * Perform a mock Diffie-Hellman key exchange using a 2048-bit prime
 ****************************************************************************/
#include <stdio.h>
#ifdef WIN32
#include "skein.h"
#endif
#include "bigint.h"

#define G 5
#define P "E53DF5FC3F650D066875837012A4E7BEA863C65CB592D9C36942CF69CBC6DD4F"\
          "D804E19CCF2696C9BEBCF18742FA5FB091CBDE1782E8291009464913ECE19745"\
          "7800EA6E43B0E2A64615D182B6DE150479C58D1C7C702D47EA3031B379CA13A2"\
          "048C964E1D1E8D4CD3815D0895BF31E53271D4607E16461B77FB26100915D679"\
          "9060203EDEBFEA9495A5A8E7CED68FC9DB2D47CE7992461BA78174608AD0BBE3"\
          "F5E63EC6C960564430CBD2E6E587D08EE12F94B5B99DFFB12C6727A25E800DAC"\
          "6CD8DE77A5BBC93B36E444B070888CB5ADD991870466968A6E9A23C2EE0A1D67"\
          "1C9B601081A44AA6A58D4DC76686EF15FCE1C9AEB4033395A9B24BE1AA1929BB"

int main(int argc, char *argv[]) {
 bigint_t *g, *p, *x, *y, *X, *Y, *a, *b;
#ifdef WIN32
unsigned long ticks = GetTickCount();
#endif
   g = bigint_init(32, G); p = bigint_init(32, 0); x = bigint_init(32, 0);
   y = bigint_init(32, 0); X = bigint_init(32, 0); Y = bigint_init(32, 0);
   a = bigint_init(32, 0); b = bigint_init(32, 0); bigint_atoi(p, P, 16);

   bigint_random(x, 512);
   printf("Alice makes up secret number x ...%08x%08x\n", x->b[1], x->b[0]);
   bigint_pow_mod(g, x, p, X);
   printf("Alice computes X = g^x mod p = ...%08x%08x\n", X->b[1], X->b[0]);

   bigint_random(y, 512);
   printf("Bob makes up secret number y   ...%08x%08x\n", y->b[1], y->b[0]);
   bigint_pow_mod(g, y, p, Y);
   printf("Bob computes   Y = g^y mod p = ...%08x%08x\n", Y->b[1], Y->b[0]);

   bigint_pow_mod(Y, x, p, a);
   printf("Alice computes a = Y^x mod p = ...%08x%08x\n", a->b[1], a->b[0]);
   bigint_pow_mod(X, y, p, b);
   printf("Bob computes   b = X^y mod p = ...%08x%08x\n", b->b[1], b->b[0]);

   if (bigint_compare(a, b) == 0) {
      printf("Alice and Bob now share a %d-bit secret.\n", bigint_width(a));
   } else {
      printf("Now what we have here is a failure to communicate.\n");
   }
   bigint_free(g); bigint_free(p); bigint_free(x); bigint_free(y);
   bigint_free(X); bigint_free(Y); bigint_free(a); bigint_free(b);
#ifdef WIN32
printf("Runtime: %lu ticks\n", GetTickCount() - ticks);
#endif
   return 1;
}

C:\dev\primes>exchange_test
Alice makes up secret number x ...7cb95859eb246ff8
Alice computes X = g^x mod p = ...4004a15e51573b9e
Bob makes up secret number y   ...946af0d568153560
Bob computes   Y = g^y mod p = ...5c167bd291d91150
Alice computes a = Y^x mod p = ...97e12de9fbfc520d
Bob computes   b = X^y mod p = ...97e12de9fbfc520d
Alice and Bob now share a 2047-bit secret.
Runtime: 34610 ticks

The smaller x is, the faster this runs.  However, the bit-width of
x should be at least twice the symmetric-key equivalent strength
of the discrete logarithm for a given size of p.  What that is exactly
is a subject of debate, but the idea is that g^x needs to be bazillions
of times larger than p because working out how many times it wrapped
past p is a successful attack.  I might use something like 256 bits for a
1024-bit modulus and 320 bits for a 2048-bit modulus.  The above run took
about 9 seconds per g^x mod p with a 512-bit x and a 2048-bit p on a
1.8 Ghz laptop.

Cheers!