Ossa Sepia

December 28, 2017

EuCrypt Chapter 3: Miller-Rabin Implementation

Filed under: EuCrypt — Diana Coman @ 9:26 p.m.

~ This is part 4 of the EuCrypt series. Start with Introducing Eucrypt. ~

Primality testing1 is a key part of any implementation of RSA2 and therefore a key part of EuCrypt as well. At first glance, there is a wide choice of primality tests that one can use, from naive direct divisions in search of factors to prime number generators and statistical primality tests. However, many of those are not currently very practical for EuCrypt as they simply take too long to run on the sort of very large numbers that RSA has to use to avoid the simplest brute-force attacks. As a result, the choice narrows down considerably to include mainly probabilistic primality tests: Fermat, Solovay-Strassen and Miller-Rabin. Of those, Miller-Rabin is simply best currently: it has the lowest error probability upper bound among the three and it is at most as expensive computationally as the others. Arguably a deterministic, polynomial-time algorithm such as AKS3 would be even better than Miller-Rabin, but unfortunately AKS is currently still too slow for EuCrypt's needs. Consequently, the chosen algorithm for primality testing in EuCrypt is Miller-Rabin mainly because of a lack of a working better alternative.

If you think that "everyone uses Miller-Rabin anyway", think again. While GnuPG and mostly everyone else using RSA is indeed likely to claim that they are also using Miller-Rabin for primality testing, you'd be well advised to check such claims very, very closely because claims are just cheap labels that stick to anything just the same. If you actually take the time to check those claims and then necessarily find yourself peeling down label after label trying to get to the actual thing that is there as opposed to what the labels claim it is, you might find that every new label further dilutes the original meaning. That's how Koch in his GnuPG sticks the "random" label on pseudo-random at best, that's how 4096 bit randomly-generated keys contain actually at most 4090 pseudo-randomly generated bits and so on until you might even find as I did last time that bits and parts of the implementation do nothing in fact. Don't take my word for it either: go and check for yourself, it's a very healthy habit that might save your very skin some day.

Despite being called a "primality test", Miller-Rabin (like all the other probabilistic primality tests) is more of a compositeness test: the algorithm can prove that a number is composite, but it can not actually prove that a number is indeed prime. Essentially, the given number is suspected of the crime of being composite (as opposed to the desired prime) and witnesses for its compositeness are sought. If one single witness for compositeness is found, then the given number is indeed composite. However, if no witness is found, Miller-Rabin can only reach a relatively weaker conclusion, namely that the given number is likely to be prime. How likely? That depends to a significant degree on the choice of candidate witnesses: how many candidate witnesses the algorithm was asked to investigate and how it actually chose them.

In its search for witnesses, Miller-Rabin relies on the important fact that most numbers between 1 and n-1 are reliable witnesses for n, if n is indeed a composite number. More precisely, at most 1/4 of those numbers are strong liars for n, meaning that at most 1/4 of them will fail to reveal the compositeness of n, when investigated by Miller-Rabin. As a result, the more witnesses investigated, the lower the chances of a composite number to pass for a prime one. Assuming that witnesses are indeed chosen randomly4, the algorithm's error probability is at most (1/4)^t, where t is the number of witnesses investigated. Obviously, each additional witness adds to the cost of running the algorithm and for this reason EuCrypt exposes this as a knob5 for you, the user, to set depending on your own needs. Use it and use it wisely!

The updated eucrypt/smg_rsa/include/knobs.h:


#define ENTROPY_SOURCE "/dev/ttyUSB0"

 * This is the number of witnesses checked by the Miller-Rabin (MR) algorithm for each candidate prime number.
 * The value of M_R_ITERATIONS directly affects the outer bound of MR which is calculated as 4^(-M_R_ITERATIONS)
 * S.MG's choice of 16 here means an outer bound of 4^(-16) = 0.0000000002,
    which is currently considered sufficient for Eulora's needs.
    If your needs are different, change this knob accordingly.
 * NB: if you use this to make keys for some serious use, an outer bound of 1e-10 is really not nearly good enough
    and therefore you'll probably want to *increase* the value of this knob.
#define M_R_ITERATIONS 16

#endif /*SMG_RSA_KNOBS_H*/

The default EuCrypt value for M_R_ITERATIONS is 16 and that means 16 randomly chosen candidate witnesses that are checked. By contrast, GnuPG 1.4.10 at first glance appears to check 5 candidate witnesses (as per cipher/primegen.c call is_prime(ptest, 5, &count2)) and at a deeper investigation it turns out that it checks 1 fixed witness (magic number 2, because why not) and 4 pseudo-randomly chosen ones at best. The label that was 5 but acted more like 4 and the parameter that didn't quite stand for what you'd expect, isn't that precisely the sort of thing you want in your cryptographic tool? No? Then stop using code from the swamps, start using signed code and in any case always read the darned code before you use it because otherwise that's exactly what you will get, each and every time: something other than what it seems, something continuously and rather invisibly to you drifting further away from what you need.

Leaving aside GnuPG for now, let's dive straight in and implement Miller-Rabin using the MPI functions as if they actually worked well6. The algorithm is quite straight forward and the code aims to be as short and clear as possible, with comments to help you follow along. The function is called is_composite, to reflect the fact that Miller-Rabin really checks for compositeness, regardless of the fact that we might prefer it to be otherwise. The n parameter is the actual large number (hence, stored as an MPI) that is suspected of being composite. The nwitnesses parameter is the number of randomly chosen witnesses to check (this is called "security parameter" in some reference books, most notably in Handbook of Applied Cryptography by Menezes, van Oorschot and Vanstone, 1997). You can also think of this nwitnesses as "number of iterations" because each iteration is effectively the check of one candidate witness. Finally, the third parameter, entropy_source, is the handler of an already opened and properly configured source of true random bits (see Chapter 2 for how this is set up in EuCrypt). First, the added function signature in eucrypt/smg_rsa/include/smg_rsa.h:


 * This is an implementation of the Miller-Rabin probabilistic primality test:
 *   checking the specified number of randomly-chosen candidate witnesses
 *    (i.e. with an outer bound of (1/4)^nwitnesses).
 * NB: a 1 result from this test means that the given n is indeed composite (non-prime)
    but a 0 result does not fully guarantee that n is prime!
    If this doesn't make sense to you, read more on probabilistic primality tests.
 * @param n the candidate prime number;
    the function will investigate whether this number is composite or *likely* to be prime.
    How likely? It depends on the number of witnesses checked, see next parameter.
 * @param nwitnesses this is the number of randomly chosen candidate witnesses to the compositeness of n
      that will be checked; the outer bound of the algorithm depends on this.
 * @param entropy_source the source of entropy (ready to read from) that will be used
    to choose candidate witnesses to the compositeness of n.
* @return 1 if at least one witness to the compositeness of n has been found
      (i.e. n is certainly composite);
      0 if no witness to the compositeness of n was found (i.e. it is likely that n is prime)
 * NB: the probability that n is *not* prime although this function returned 0 is
      less than (1/4)^nwitnesses, but it is NOT zero.
int is_composite( MPI n, int nwitnesses, int entropy_source);

And the corresponding implementation, in a new file eucrypt/smg_rsa/primegen.c :

/* primegen.c - prime number generation and checks
 * S.MG, 2017


#include "smg_rsa.h"

 * is_composite
 * Returns 1 if it finds evidence that n is composite and 0 otherwise.
 * NB: this is a probabilistic test and its strength is directly linked to:
 *  - the number of witnesses AND
 *  - the quality of the entropy source given as arguments!

int is_composite( MPI n, int nwitnesses, int entropy_source) {
  int evidence = 0;
  int nlimbs = mpi_get_nlimbs(n);       /* see MPI representation   */
  unsigned int nbits = mpi_get_nbits(n);        /* used bits        */
  unsigned int noctets = (nbits + 7) / 8; /* source works on octets */
  MPI nminus1 = mpi_alloc(nlimbs);      /* n-1 value is used a LOT  */
  MPI mpi2 = mpi_alloc_set_ui(2);         /* 2 as MPI               */
  MPI a = mpi_alloc(nlimbs);      /* candidate witness              */
  MPI y = mpi_alloc(nlimbs);      /* intermediate values            */
  MPI r = mpi_alloc(nlimbs);      /* n = 1 + 2^s * r                */
  int s;                          /* n = 1 + 2^s * r                */
  int j;                          /* counter for loops              */
  int nread;              /* number of random octets actually read  */

  /* precondition: n > 3 */
  assert( nbits > 2 );

  /* nminus1 = n - 1 as MPI                                         */
  mpi_sub_ui( nminus1, n, 1);

   * find r odd and s so that n = 1 + 2^s * r
   * n-1 = 2^s * r
   * s is given by the number of trailing zeros of binary n-1
   * r is then obtained as (n-1) / (2^s)
  s = mpi_trailing_zeros( nminus1 );
  mpi_tdiv_q_2exp(r, nminus1, s);

   * Investigate randomly chosen candidate witnesses.
   * Stop when either:
      * the specified number of witnesses (nwitnesses) have
        been investigated OR
      * a witness for compositeness of n was found
  while (nwitnesses > 0 && evidence == 0) {
    unsigned char *p = xmalloc(noctets);
    do {
      nread = get_random_octets_from(noctets, p, entropy_source);
    } while (nread != noctets);

    mpi_set_buffer(a, p, noctets, 0);
    /* ensure that a < n-1 by making a maximum nbits-1 long:
        * clear all bits above nbits-2 in a
        * keep value of bit nbits-2 in a as it was
    if (mpi_test_bit(a, nbits - 2))
      mpi_set_highbit(a, nbits - 2);
      mpi_clear_highbit(a, nbits - 2);

    /* ensure that 1 < a < n-1; if not, try another random number
     * NB: true random means a CAN end up as 0 or 1 here.

    if (mpi_cmp(a, nminus1) < 0 && mpi_cmp_ui(a, 1) > 0) {
      /* calculate y = a^r mod n */
      mpi_powm(y, a, r, n);
      if (mpi_cmp_ui(y, 1) && mpi_cmp(y, nminus1)) {
        j = 1;
        while ( (j < s) && mpi_cmp(y, nminus1) && (evidence == 0) ) {
          /* calculate y = y^2 mod n */
          mpi_powm(y, y, mpi2, n);
          if (mpi_cmp_ui(y, 1) == 0)
            evidence = 1;
          j = j + 1;
        } /* end while */
        if (mpi_cmp(y, nminus1))
          evidence = 1;
      } /* end if y != 1 and y != n-1 */
      nwitnesses = nwitnesses - 1;
    } /* end if 1 < a < n-1 */
  } /* end while for investigating candidate witnesses */

  mpi_free( nminus1 );
  mpi_free( mpi2 );
  mpi_free( a );
  mpi_free( y );
  mpi_free( r );

  return evidence;

The variable evidence is initially 0 as Miller-Rabin does not yet have any evidence about n being composite. When and if evidence of compositeness is found for n, this variable will get updated to 1. If the whole algorithm finishes without updating this variable, it means that n is probably prime, with a maximum probability error of (1/4)^nwitnesses, as previously discussed. In any case, this variable holds the result of the Miller-Rabin test at any moment and its value is the one returned when the test finishes.

The nlimbs and nbits are basically measures of how long the MPI n actually is and they are initialised with values returned by the corresponding MPI functions. The nbits value is then converted to number of octets (in noctets) for the very simple reason that the source of randomness in EuCrypt anyways reads at the moment full octets rather than individual bits. This is of course a matter of choice as you could change the setting of the source to read bit by bit, but I can't quite see at the moment any significant advantage to that.

Having obtained this basic length-information on n, the function then goes on to declare and allocate memory for a set of MPI variables that it will need for the Miller-Rabing algorithm itself. Note that there is *no* use of the so-called "secure" memory thing from MPI for the unfortunate reason that the existing implementation of secure is very much an empty label: all it does is to set a flag so that theoretically the memory is not swapped, but there is no guarantee to either that or to the more useful fact that nobody else can read that memory. So given that there is in fact no secure memory implementation no matter how much it would be useful if there was one, EuCrypt takes instead the honest and practical approach of making it clear that it uses plain memory and nothing else. No label if no actual matching object to stick it on, as simple as that.

Once the needed variables are declared and initialised when appropriate, a precondition is checked: assert( nbits > 2); What's this? It's a sanity check essentially because Miller-Rabin is meant for checking large integers, not tiny ones. Moreover, due to the insanity of the underlying MPI which considers in its infinite stupidity that 0 for instance is represented on 0 bits, tiny values of less than 4 (hence, represented on less than 3 bits) will... block the whole thing. Let me point out for now just the very simple fact that the algorithm uses nbits-1 and nbits-2 meaning that nbits should better be at least 3 or otherwise it ends up trying to work with numbers represented on 0 bits and other such nonsense. So instead of risking working with nonsense, EuCrypt uses this assert call to abort the whole thing rather than propagate the nonsense even one instruction further.

Oh, if you wonder by any chance whether GnuPG bothers to even consider such corner cases, that's good for you. I'm sure it's all right if they don't because such cases "should never happen" and "nobody calls Miller-Rabin on small numbers" and all those wonderful castles of trust built on nothing but air. Come to think about it, I even enjoy blowing up such air-supported castles and what not, they make a most-satisfying poufffff! Do you enjoy living in them? POUFFFFF!

The rest of the function closely follows the Miller-Rabin algorithm and the comments in the code hopefully make it easier to understand what each line does even when using the MPI calls. Note that candidate witnesses are chosen indeed randomly by using the specified source and making sure that the call truly returned the requested number of random octets. However, the actual number of random bits in any candidate random number will be by necessity nbits-1 because of the constraint that the random candidate should be less than n-1. This constraint is enforced by simply clearing any bits above nbits-2 (bit numbering starts at 0 so last bit is n-1 rather than n) but keeping at the same time the value of bit on position nbits-2. If that was 1 then mpi_set_highbit(a, nbits-2) is called. If that was 0 then mpi_clear_highbit(a, nbits-2) is called instead.

Note that those 2 mpi functions (mpi_set_highbit and mpi_clear_highbit) are supposedly similar in that they clear all bits above the position indicated and otherwise set to 1 or, respectively, to 0 the bit on the given position. However, the actual code reveals that they are not entirely similar: mpi_set_highbit allocates more memory if the position given is above the current size of the mpi; mpi_clear_highbit doesn't allocate memory in this case. This means effectively that mpi_set_highbit returns an mpi of specified length, while mpi_clear_highbit returns always an mpi of length smaller than the specified bit position. At first glance this might seem to make some sense but the reality is worse than that: mpi_clear_highbit sometimes trims the leading 0 bits of a number and sometimes... doesn't! Possibly for this reason of rather dubious behaviour, GnuPG's Miller-Rabin avoids using mpi_clear_highbit entirely and dances around instead with double calls to mpi_set_highbit instead, on both branches of the if. Since I'm doing coding here rather than voodooing or naked-dances-in-the-code, I fixed instead mpi_clear_highbit to at least reliably trim leading 0s at all times and I'm using it where it is needed. The memory allocation issue is not relevant for my code here anyway because there is already enough memory allocated for the MPI at the beginning of this function.

A core aspect of any Miller-Rabin implementation is the way in which candidate witnesses are chosen. EuCrypt chooses them entirely at random in the interval [2, n-2]. Different options were considered, most notably that of choosing a smaller, potentially more relevant interval, for instance by increasing the lower bound of this interval to at least 256 (2^8). Moreover, according to Eric Bach's paper7, an appropriate upper bound for witnesses would be 2*log(n)*log(n). However, Bach's result relies on the Extended Riemann Hypothesis (ERH) which hasn't been proved so far. So although those bounds are very appealing, EuCrypt sticks for now with using randomly chosen witnesses over the whole interval.

To sum it all up, the main changes brought by today's vpatch are the following:

  1. Addition of primegen.c as part of smg_rsa. This includes the actual implementation of the Miller-Rabin algorithm with the choices discussed above. It uses the MPI implementation introduced in Chapter 1 and corrected along the way.
  2. Changes to MPI, namely further identification and, to the extent needed for current needs of EuCrypt, the killing of additional cockroaches that were identified in the MPI code as a result of investigating the functions that are needed for Miller-Rabin. Most notably: fixing mpi_clear_highbit so that it always trims leading 0 if any, as opposed to current functionality where it sometimes trims them and sometimes not; identifying the fact that MPI currently considers that 0 is represented on 0 bits. Note that this last issue is flagged up and made obvious through updated tests but it is not changed mainly because following all its potential implications through MPI at this stage would eat up so much time as to make it cheaper to just implement from scratch something solid at least. So for the time being at least, the decision made is to honestly admit and clearly highlight this existing fault of MPI.
  3. New tests for MPI highlighting the new issues uncovered.
  4. New tests for smg_rsa focusing on Miller-Rabin: the testing program allows the user to specify the number of runs for each data point and reports a test as failed if at least one run returned a result different from the one expected; test data for Miller-Rabin is chosen to include a few mersenne primes, carmichael composites and a phuctor-found non-prime public exponent of someone's RSA key. Feel free to add to them anything you consider relevant and then run those tests!
  5. A small change to the function that fetches random octets from the FG: the errno is now set to 0 prior to every call that reads from the USB port and its value is then checked in addition to the return value of the read function. This change is needed in order to avoid the unfortunate case when no bits are read but there is apparently no underlying error. Previous version of my function would still abort in such case but current version would instead keep trying as this is a more useful approach for EuCrypt.

The updated get_random_octets_from function in eucrypt/smg_rsa/truerandom.c:

int get_random_octets_from(int noctets, unsigned char *out, int from) {

  int nread;
  int total = 0;

  while (total < noctets) { errno = 0; nread = read(from, out+total, noctets-total); //on interrupt received just try again if (nread == -1 && errno == EINTR) continue; //on error condition abort if (errno != 0 && (nread == -1 || nread == 0)) { printf("Error reading from entropy source %s after %d read: %sn", ENTROPY_SOURCE, total, strerror(errno)); return total; //total read so far } if (nread > 0)
      total = total + nread;
  return total; //return number of octets read

The new test_is_composite function in eucrypt/smg_rsa/tests/tests.c:

void test_is_composite(int nruns, char *hex_number, int expected) {
  int i;
  int output;
  int count_ok = 0;
  int source = open_entropy_source(ENTROPY_SOURCE);
  MPI p = mpi_alloc(0);

  mpi_fromstr(p, hex_number);
  printf("TEST is_composite on MPI(hex) ");
  mpi_print(stdout, p, 1);
  for (i=0; i < nruns; i++) {
    output = is_composite(p, M_R_ITERATIONS, source);
    if (output == expected)
      count_ok = count_ok + 1;
  printf("done, with %d out of %d correct runs for expected=%d: %sn", count_ok, nruns, expected, count_ok==nruns? "PASS"$

The updated main in eucrypt/smg_rsa/tests/tests.c:

int main(int ac, char **av)
  int nruns;
  int id;

  if (ac<2) {
    printf("Usage: %s number_of_runs [testID]n", av[0]);
    return -1;
  nruns = atoi(av[1]);

  if (ac < 3) id = 0; else id = atoi(av[2]); if (id == 0 || id == 1) { printf("Timing entropy source...n"); time_entropy_source(nruns,4096); } if (id == 0 || id == 2) { /* a few primes (decimal): 65537, 116447, 411949103, 20943302231 */ test_is_composite(nruns, "0x10001", 0); test_is_composite(nruns, "0x1C6DF", 0); test_is_composite(nruns, "0x188DD82F", 0); test_is_composite(nruns, "0x4E0516E57", 0); /* a few mersenne primes (decimal): 2^13 - 1 = 8191, 2^17 - 1 = 131071, 2^31 - 1 = 2147483647 */ test_is_composite(nruns, "0x1FFF", 0); test_is_composite(nruns, "0x1FFFF", 0); test_is_composite(nruns, "0x7FFFFFFF", 0); /* a few carmichael numbers, in decimal: 561, 60977817398996785 */ test_is_composite(nruns, "0x231", 1); test_is_composite(nruns, "0xD8A300793EEF31", 1); /* an even number */ test_is_composite(nruns, "0x15A9E672864B1E", 1); /* a phuctor-found non-prime public exponent: 170141183460469231731687303715884105731 */ test_is_composite(nruns, "0x80000000000000000000000000000003", 1); } if (id > 2)
    printf("Current test ids: 0 for all, 1 for entropy source test only, 2 for is_composite test only.");

  return 0;

Finally, the .vpatch itself and my signature (you'll need all previous .vpatches of EuCrypt to press this one):

In the next chapter we take one step further towards having RSA, so stay tuned, drink only the good stuff and make sure you won't go... POUFFF!

  1. Primality testing means answering the question: is this number prime or not? 

  2. If this is not clear to you, it's best to just review RSA itself. In a nutshell: the whole working of RSA as cryptographic tool relies on properties of prime numbers; both secret and private RSA keys are essentially made out of prime numbers. Don't eat only this nutshell though - better read the original paper by Rivest, Shamir and Adleman: A Method for Obtaining Digital Signatures and Public-Key Cryptosystems

  3. Agrawal, Kayal and Saxena, Primes is in P, Ann. of Math, Vol. 2, pp. 781-793. 

  4. Which is *not* the case in GnuPG where first witness is actually...fixed and the rest are anyway chosen pseudo-randomly, go and read that code. By contrast, EuCrypt actually chooses them randomly, using a true random number generator, the FG. 

  5. Like all other knobs, this can be found in include/knobs.h 

  6. No, they don't, surprise, surprise. So I'll fix what I use and otherwise at least highlight what other problems I find, as I find them, such is this wonderful world we live in. 

  7. Bach, Eric, 1990. Explicit Bounds for Primality Testing and Related Problems. Mathematics of Computation, 55, pp. 355-380. 

December 21, 2017

EuCrypt: Correcting MPI Implementation

Filed under: EuCrypt — Diana Coman @ 9:53 p.m.

~ An unexpected part of the EuCrypt library series. Start with Introducing EuCrypt. ~

This is a sad but necessary interruption in the EuCrypt series itself: although coming immediately after chapter 2, this is not chapter 3 at all, I'm sorry to say. Instead of adding another useful part of smg-rsa as the actual chapter 3 does, I'm forced by the uncovered reality in the field to publish this correction: a correction of a coding error that has lived for years in the very much used GnuPG 1.4.10 open sore code. As a wonderful and truly epic example of all the great qualities of the open-source approach that thrives on thousands of eyes that surely quickly and accurately find and correct any bug, this error has survived perfectly fine until now, undetected and even papered over and buried deeper when it tried to manifest! It is quite natural after all, not to mention according to the "law": given enough eyeballs, any bug will be quickly plastered over and buried as deep as it can be. There, Mr. Raymond, I fixed that Bazaar description for you. Let it be said again and again and forever that in anything that matters, it's never quantity that matters, but always quality.

The error itself is a tiny thing with a huge impact: a single wrong letter in the wrong place, transforming a piece of code meant to copy over a whole MPI into a piece of code that does precisely... nothing1. Obviously, typing the wrong letter can and does happen from time to time to anyone who types. However, it takes a very sloppy sort of monkey pushing the keys to fail to notice its effect, namely that the resulting code does... nothing. You'd think that anyone would at least read their code once more at a later time to catch such silly mistakes, pretty much like one reads any other text for mere proofreading if nothing more. You'd think that anyone at anytime would at least run their code once to see it doing the thing they wanted it to do, wouldn't you? Well, it clearly shows this is not the way of the great Bazaar, no. So let me show you the resulting code that achieves nothing but still actually eats up some resources whenever called, here you are, straight from mpi/include/mpi-internal.h:

#define MPN_COPY_INCR( d, s, n)
    do {
  mpi_size_t _i;
  for( _i = 0; _i < (n); _i++ )
      (d)[_i] = (d)[_i];
    } while (0)


Speaking of eyeballs, take your time a bit and spot the error. Then call your child or your grandpa over, ask them to spot the error too and let me know how this addition of another pair of eyeballs clearly helped.

Once the error is spotted, the reader might ask the obvious question, of course: "why does the caller appear to work?" The answer to this is... layered let's say. The caller does not actually work in fact, unsurprisingly. The caller is a method called mpi_tdiv_q_2exp(MPI w, MPI u, unsigned count). This method supposedly shifts u by count bits to the right and stores the result in w. Except that it doesn't actually *always* do this: in some cases, all it does is to trim from w count bits and nothing more because it relies on the above good-for-nothing code to copy the relevant bits from u to w. Here, let's look at its code that is found in mpi/mpi-div.c:

mpi_tdiv_q_2exp( MPI w, MPI u, unsigned count )
    mpi_size_t usize, wsize;
    mpi_size_t limb_cnt;

    usize = u->nlimbs;
    limb_cnt = count / BITS_PER_MPI_LIMB;
    wsize = usize - limb_cnt;
    if( limb_cnt >= usize )
  w->nlimbs = 0;
    else {
  mpi_ptr_t wp;
  mpi_ptr_t up;

  RESIZE_IF_NEEDED( w, wsize );
  wp = w->d;
  up = u->d;

  count %= BITS_PER_MPI_LIMB;
  if( count ) {
      mpihelp_rshift( wp, up + limb_cnt, wsize, count );
      wsize -= !wp[wsize - 1];
  else {
      MPN_COPY_INCR( wp, up + limb_cnt, wsize);

  w->nlimbs = wsize;

Let's actually read that a bit and see how it works: usize stores the number of "limbs" of an MPI, while limb_cnt stores the number of full limbs that need to be discarded as a result of a shift right by count bits. Those limbs are essentially machine words, groups of bits that the machine can work with directly. Their length in bits is that BITS_PER_MPI_LIMB, hence the integer division of count by it to find out just how many full limbs a shift by count bits implies. Once that is known, the code sets the remaining size (as number of limbs) of the result w and proceeds to copy over from u the bits that remain after the shift (the wp and up are simply pointers to the internal array of limbs of w and u respectively). However, at this point, the path splits by means of that if (count): the shift can further affect another limb by discarding (shifting right) only some of its bits and in this case there is in fact a shift-and-copy operation by means of calling mpihelp_rshift; if this is not the case and the original shift was made by a multiple of BITS_PER_MPI_LIMB, there is only a straightforward copy of limbs to make and that is supposedly achieved by our do-nothing-by-error code, called MPN_COPY_INCR.

Considering the above, it's hopefully no surprise to you when I say that mpi_tdiv_q_2exp fails of course, as soon as it makes use of the buggy code of MPN_COPY_INCR. And you'd think the error would at least be caught at this point if it wasn't caught earlier, wouldn't you? Basic testing of the most obvious kind, meaning both branches of an if, something everyone always does in one way or another, right? Bless your heart and sweet youth and all that but wake up already, as this is open sore we are talking about: no, it's not caught here either, of-course-not, it's just-not-enough-eyeballs-yet. Honestly, I suspect they meant "given enough eyeballs LOST as a result, any bug becomes shallow." We might even live to see this, you know?

Still, there is more! This defective mpi_tdiv_q_2exp is actually used by another method in the original GnuPG so surely the error is exposed there! It has to be, it's by now the third layer and presumably it's really not all that many layers that can be built on code-that-does-nothing + code-that-works-only-sometimes. Oh, but the Bazaar is all mighty, the Bazaar can2, unlike those pesky cathedrals that actually had any foundation. Let me illustrate this to you, with code taken straight from GnuPG 1.4.10, from the primegen.c file, from method is_prime that supposedly does probabilistic primality testing and such little things on which the security of your RSA key depends:

    q = mpi_copy( nminus1 );
    k = mpi_trailing_zeros( q );
    mpi_tdiv_q_2exp(q, q, k);

What's that, you ask? Why, nothing terrible, just a bit of code meant to calculate q as nminus1 / (2^k). Since a division by 2^k is simply a right shift by k bits, that's really the perfect job for mpi_tdiv_q_2exp, isn't it? So then why does this code do first a copy of nminus1 into q and only then a shift right by k bits of resulting q into... same variable q. Why exactly doesn't it do simply a shift right by k bits of nminus1 into q? Like this:

  k = mpi_trailing_zeros( nminus1 );
  mpi_tdiv_q_2exp( q, nminus1, k );

Two method calls instead of three, two lines instead of three, overall much clearer and easier to follow code anyway, since we are dividing indeed nminus1 by 2^k, not some "q" that is meant to be the result, really. So why do it in a complicated way when you can do it in a simple way? Except, you can't, you see, because you don't actually read the code you use and the code you use is broken but nobody actually bothered to check it. And when the error that is by now 3 layers deep manifests itself through unexpected results of the simple and straightforward code that should have been there, you don't spend like me the hours needed to track down the error and actually, finally, mercifully put it out of its misery correcting the code. Oh no, that would be wasted time, wouldn't it? Instead you are being productive and you find a workaround, simply papering the error over and dancing around it with an idiotic extra-copy that makes *this* code more difficult to follow and further *hides* the previous error, pushing it one layer deeper. Oh, now I plainly see the main advantage of open source: since you are not responsible in any way for the code you write, you can get away with such despicable behaviour, can't you?

Well, here is not open source. Here code gets actually read and errors get dissected, documented and corrected when found, not swept under the carpet. So, to correct this error, first thing written is a basic test that runs all the branches of that if in mpi_tdiv_q_2exp. Here it is, together with its helper print function, both to be found from now on in mpi/tests/test_mpi.c:

void print_results(MPI in, MPI out, char * title)
  fprintf(stdout, "******** %s ********", title);

  fprintf(stdout, "input : ");
  mpi_print(stdout, in, 1);

  fprintf(stdout, "output: ");
  mpi_print(stdout, out, 1);


 * Test that will fail on original code and will pass after EuCrypt fix is applied.
void test_rshift()
  MPI out, in, copy_in;
  out = mpi_alloc(0);
  in = mpi_alloc(0);
  copy_in = mpi_alloc(0);

  mpi_fromstr(out, "0x20E92FE28E1929");   /* some value */
  mpi_fromstr(in, "0x2000000010000001000000002");
  mpi_fromstr(copy_in, "0x2000000010000001000000002"); /* to make sure the actual input is print, since call can modify in */

  /* print value of BITS_PER_MPI_LIMB */
  fprintf(stdout, "BITS_PER_MPI_LIMB is %dn", BITS_PER_MPI_LIMB);

  /* shift by 0 */
  mpi_tdiv_q_2exp(out, in, 0);
  print_results(copy_in, out, "TEST: right shift by 0");

  /* shift by multiple of BITS_PER_MPI_LIMB */
  mpi_fromstr(in, "0x2000000010000001000000002");

  mpi_tdiv_q_2exp(out, in, BITS_PER_MPI_LIMB);
  print_results(copy_in, out, "TEST: right shift by BITS_PER_MPI_LIMB");

  /* shift by non-multiple of BITS_PER_MPI_LIMB */
  mpi_fromstr(in, "0x2000000010000001000000002");
  mpi_tdiv_q_2exp(out, in, BITS_PER_MPI_LIMB - 3);
  print_results(copy_in, out, "TEST: right shift by BITS_PER_MPI_LIMB - 3");


Running this with the old code from mpi (so the code from chapter 2, earlier) will show just how mpi_tdiv_q_2exp "works" so that there is no doubt remaining about that. After this and before doing any change, the next step is to search in the code for *any other* users of either mpi_tdiv_q_2exp or the MPN_COPY_INCR itself, since those users for all I know might even rely by now on the wrong behaviour. Mercifully, this search returned empty and so I can proceed finally to the fix itself, which is of course very easy to do *after all this work* i.e. once it was found and tracked down and isolated to ensure that a change does not break down something else. And after this, the last but mandatory step is of course to run the previously written test again and check its output. The corrected code in mpi/include/mpi-internal.h:

#define MPN_COPY_INCR( d, s, n)
    do {
  mpi_size_t _i;
  for( _i = 0; _i < (n); _i++ )
      (d)[_i] = (s)[_i];
    } while (0)

After all this work that was caused not by the original tiny error itself, not by the wrong letter at the wrong place, but by the total and repeated failure to correct it afterwards, even 3 layers up, tell me again about all those eyeballs and about how productive it is to write a quick workaround instead of searching for the error and eliminate it at source. Then go and run a grep -r "workaround" . in any source directory of any big open source project, just for... fun let's say, what else.

To wrap this up, here is the vpatch with all the above changes, and my signature for it:

Hopefully no other similar trouble will surface until next week, so that I can finally move on to Chapter 3 that is all about the Miller-Rabin algorithm. Stay tuned!

  1. You know, by this point one might even be relieved to find that it does at least nothing. As opposed to doing something worse. 

  2. "Yes, we can", right?  

December 17, 2017

My Reference Code Shelf

Filed under: Coding — Diana Coman @ 6:59 p.m.

Reading code is a pain or a gain, depending on whose1 code you read, of course. And while I had a rather significant share of pain since my recent return to programming a few years ago, I recently enjoyed as well some gains. Reading code is quite like reading literature in this respect really, except that for code2 there are precious few - if any - editors to intently review some proposed piece of code and personally, publicly vouch for it. Or on the contrary but just as useful, publicly vouch *against* it.

Arguably, vouching against some code might be more salutary - and even more sanitary - in the light of the horrors that can be routinely found in the current brave world of everyone can code. Still, I'm an optimist here3 and I'd much rather focus therefore on the positive: the reading and reviewing of good code, the vouching for the rare but therefore even more precious pieces of code that are a delight to read and, more importantly, a rock-solid piece of foundation on which one can build further. Note that perfection is neither implied nor the objective here as code is not some ideal construct made by Gods or out of pure thought: code that I sign is code I consider honest, useful and readable, approximately in this order. My definition for those may even differ from yours, so make and assume your own decisions on this matter. Just like I am making and publicly assuming mine.

For such important pieces of code that I got to read, to know and to accept as honest, useful and readable, I will provide therefore my signature for whatever it may be worth to another reader and a place on my public shelf from where one can download the code itself, my signature for it and any other signatures for it from people that I trust essentially with this important but not always easy task of reviewing code.

This is in my opinion the least I can do to support the very sane and very dear to me approach of rewarding intelligence so I see it essentially as my duty too.

  1. Yes, code IS quite personal in this sense - it's not an anonymous or machine-made artefact but someone's own output and as such part of their quite personal public image. 

  2. My suspicion is that the situation is rather similar for literature nowadays, but literature has seen happier days, what can I say more. 

  3. The sort of optimist who finds out from experience that idiocy has infinite points

December 14, 2017

EuCrypt Chapter 2: A Source of Randomness

Filed under: EuCrypt — Diana Coman @ 11:53 p.m.

EuCrypt uses as source of randomness the Fuckgoats auditable TRNG (True Random Number Generator) from S.NSA (No Such lAbs). The choice here was made very easy by a basic combination of facts: on one hand, EuCrypt needs an actual, auditable source of randomness1 as opposed to anything else, pseudo-random generators included; on the other hand, the Fuckgoats (FG) device is the only currently available auditable TRNG. So problem solved and even rather narrowly solved at that. I'm quite grateful that for once there IS actually something matching the requirements and therefore I don't have to cobble it together myself from bits and pieces.

The above being said, you as the user of EuCrypt are *expected* to make your own decisions. Consequently, there really is nothing stopping you from using whatever you want as your own "source of randomness", be it actually random or pseudo-random or straight non-random or anything in between, why would EuCrypt care? Just change existing knobs in EuCrypt (see below) or directly replace the relevant methods (mainly get_random_octets, discussed below) with your own code and that's what will get used. For my own needs however, I'll use FG and moreover, one that I actually bought (even with some additional cost) and tested myself.

To keep this as clear as possible, let's start with the tiniest part, namely a single lonely knob for EuCrypt, giving the name of the entropy source to use and defined in knobs.h, as part of the smg_rsa component of EuCrypt (eucrypt/smg_rsa/include/knobs.h):


#define ENTROPY_SOURCE "/dev/ttyUSB0"

#endif /*SMG_RSA_KNOBS_H*/

As it is quite clear from above, current version of EuCrypt assumes that an FG is connected simply to an USB port that can be accessed via /dev/ttyUSB0. If the device dev path on your machine is different, change the value of this knob accordingly. If your FG is connected to something other than an USB port (such as a serial port for instance), you'll need to change more than this knob here.

Now for the signatures of the functions that will provide access to the specified source of randomness, have a look at smg_rsa.h:

/* smg_rsa.h
 * S.MG, 2017

#ifndef SMG_RSA_H
#define SMG_RSA_H

#include "mpi.h"
#include "knobs.h"


 * Opens and configures (as per FG requirements) the specified entropy source (e.g. "/dev/ttyUSB0")
 * @param source_name the name of the file to open (e.g. "/dev/ttyUSB0")
 * @return the descriptor of the open file when successful; negative value otherwise
int open_entropy_source(char* source_name);

 * Returns noctets random octets (i.e. 8*noctets bits in total) as obtained from EuCrypt's preferred source.
 * Preferred source is defined in knobs.h as ENTROPY_SOURCE and should be a TRNG (e.g. Fuckgoats).
 * @param nboctets the length of desired random sequence, in octets
 * @param out pointer to allocated memory space for the requested random noctets;
 * NB: this method does NOT allocate space!
 * @return the actual number of octets that were obtained from the currently configured entropy source
 * (this is equal to noctets on successful read of required noctets)
int get_random_octets(int noctets, unsigned char *out);

/* Returns noctets random octets as obtained from the specified "from" source;
 * NB: the "from" source is considered to be the handle of an already opened stream;
 * This method will simply attempt to read from the source as needed!
 * @param noctets the length of desired random sequence, in octets
 * @param out pointer to allocated memory space for the requested random octets;
 * NB: this method does NOT allocate space!
 * @param from handle of an already opened entropy source - this method will just READ from it as needed
 * @return the actual number of octets that were obtained
int get_random_octets_from(int noctets, unsigned char *out, int from);

#endif /*SMG_RSA*/

The difference between the two functions that retrieve a specified number of octets is that one opens and closes the source itself (hence, every time it is called) while the second one simply reads the specified number of bits from an already opened source that is given as argument. Note that the function opening and closing the source itself uses the other one for the actual reading. The reason for providing both functions is simply the fact that opening/closing the source can easily be a significant overhead when reading only a few octets at a time.

To configure and open the source, the function used is open_entropy_source. This function provides a way to obtain a handler of the opened source that can then be passed repeatedly to the function that retrieves octets, as needed.

One important aspect to note above is that the two functions that retrieve random octets do NOT allocate memory for the output. They assume that the caller has allocated enough memory and provided a valid pointer. The reason for this is two-fold: first, as a design principle I prefer to keep allocation and de-allocation of memory in the same function as much as possible without passing responsibility around; second, for RSA purpose, the memory allocation is often done with specific MPI methods and using those (or indeed being aware of them) is outside the scope of this bit of code as it has nothing at all to do with getting random bits.

The actual implementations of the above functions are found in truerandom.c. UPDATED on 4 January 2018: this code has been changed, see Chapter 4 and corresponding .vpatch!

#include < stdio.h>
#include < stdlib.h>
#include < string.h>

#include < fcntl.h>
#include < unistd.h>
#include < termios.h>
#include < errno.h>

#include "smg_rsa.h"

int set_usb_attribs(int fd, int speed) {
	struct termios tty;
	if (tcgetattr(fd, &tty) < 0) {
		return -1;

	//input and output speeds
	cfsetospeed(&tty, (speed_t)speed);
	cfsetispeed(&tty, (speed_t)speed);

	tty.c_cflag |= (CLOCAL | CREAD);	//ignore modem controls
	tty.c_cflag &= ~CSIZE;
	tty.c_cflag |= CS8;			//8 bit characters
	tty.c_cflag &= ~PARENB;	//no parity bit
	tty.c_cflag &= ~CSTOPB;	//only need 1 stop bit
	tty.c_cflag &= ~CRTSCTS;	//no hardware flow control

	//non-canonical mode
	tty.c_cflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
	tty.c_cflag &= ~OPOST;

	//read at least one octet at a time; timeout 1 tenth of second between octets read
	tty.c_cc[VMIN] = 1;
	tty.c_cc[VTIME] = 1;

	if (tcsetattr(fd, TCSANOW, &tty) != 0)
		return -1;

	return 0;

int open_entropy_source(char* source_name) {
	int in, err;

	in = open(source_name, O_RDONLY | O_NOCTTY | O_NDELAY);
	if (in == -1) {
		printf("ERROR: failure to open entropy source %s: %sn", source_name, strerror(errno));
		return in;	//failed to access entropy source

	fcntl(in, F_SETFL, 0);

	err = set_usb_attribs(in, B115200);
	if (err==-1) {
		printf("Error setting attributes on %s: %sn", source_name, strerror(errno));
		return err;

	return in;	//source opened, return its descriptor

int get_random_octets_from(int noctets, unsigned char *out, int from) {

	int nread;
	int total = 0;

	while (total < noctets) {
           nread = read(from, out+total, noctets-total);
           //on interrupt received just try again
           if (nread == -1 && errno == EINTR)
           //on error condition abort
           if (nread == -1 || nread == 0) {
             printf("Error reading from entropy source %s: %sn", ENTROPY_SOURCE, strerror(errno));
             return total; //total read so far
           if (nread > 0)
		total = total + nread;
	return total;	//return number of octets read

int get_random_octets(int noctets, unsigned char *out) {
	int in;
	int nread = 0;

	in = open_entropy_source(ENTROPY_SOURCE);
	if (in > 0) {
		nread = get_random_octets_from(noctets, out, in);
	return nread;

As it can be seen above, a significant part of the code is simply for configuring the device. Most importantly, the configuration aims to turn OFF all flow control and to set the baud rate as required by FG. While this should work under most versions of Linux, be aware of the known pl2303 vs pl2303x issue with some connectors on older systems.

Note that an incorrectly configured device will simply block and since the functions above are written to always wait for the full number of bits required, they will *also* block in this case.

Finally, a basic test in tests/test.c:

#include "smg_rsa.h"

#include < stdlib.h>
#include < time.h>

void err(char *msg)
  fprintf(stderr, "%sn", msg);

void time_entropy_source(int nruns, int noctets) {
	unsigned char buffer[noctets];
	int read, i;
	struct timespec tstart, tend;
	long int diff;

	clock_gettime(CLOCK_MONOTONIC, &tstart);
	for (i=0; i < nruns; i++) {
		read = get_random_octets(noctets,buffer);
		if (read != noctets)
			err("Failed reading from entropy source!");
	clock_gettime(CLOCK_MONOTONIC, &tend);

	diff = tend.tv_sec-tstart.tv_sec;
	double kbps = (nruns*noctets) / (diff*1000.0);
	printf("ENTROPY source timing: %d kB in %ld seconds, at an average speed of %f kB/s over %d runs of %d octets eachn", nruns*noctets, diff, kbps, nruns, noctets);

int main(int ac, char **av)
	int nruns;

	if (ac<2) {
		printf("Usage: %s number_of_runsn", av[0]);
		return -1;
	nruns = atoi(av[1]);

	printf("Timing entropy source...n");

  return 0;

For testing, simply plug into an USB port your (previously audited, hopefully) FG, compile everything and then run the test with as many runs as you want. When it's done (so after a while, depending on how many runs you asked for), it should print on screen the speed at which it obtained the random bits from FG.

Following the sad realisation that I can't currently safely alter folder structure under V, I created a genesis patch for EuCrypt containing the intended structure (more like: writing in stone the intended structure) and it's on top of this that each chapter of EuCrypt adds new content by means of vpatches. Here you have everything you need so far:

Note that the patch in Chapter 1 is NOT needed anymore directly for EuCrypt (it still is valid though in itself and as a further snip on the standalone mpi so I will keep it where it is). The changes that patch makes are already included in the version of mpi that ch1_mpi.vpatch simply bring into EuCrypt.

In the next chapter, since we have already an MPI implementation as well as a way to access true randomness, we can get ever so closer to the actual RSA itself, so stay tuned!

  1. I'll likely expand on this as soon as I get to the actual implementation of RSA so in the next few chapters. 

December 7, 2017

Introducing EuCrypt

Filed under: EuCrypt — Diana Coman @ 6:09 p.m.

EuCrypt is a self-contained library that Eulora server will use for its communication needs. EuCrypt has the following 5 main components:

  1. smg-comm - the implementation of the basic client-server communication protocol. This makes use of all the other components, namely:
  2. smg-serpent - the symmetric cipher that is used by smg-comm for everyday message exchanges between Eulora clients and server.
  3. smg-keccak - the implementation of the keccak function and sponge construction, used by smg-comm mainly as part of the data padding scheme.
  4. smg-rsa - the implementation of the RSA encryption algorithm using a source of true randomness and the sane-mpi implementation of big number arithmetics:
  5. sane-mpi - the implementation of big number arithmetics, as extracted from GnuPG 1.4.10 by Stanislav Datskovskiy.

The above structure is only meant to give you a high-level, top-down idea of what EuCrypt is made of. However, to actually understand EuCrypt at all1, you'll need to read the code and discussion for each of the above components and so I will start from bottom-up with the code + discussion of sane-mpi and then proceed to add parts and pieces until we get the whole library. Note that some of the components above are bigger than others and therefore I will split the discussion of those in several installments, so expect overall more than 5 posts on this. Comments and questions are welcome at any time and on any of the parts, so don't be shy.

Chapter 1: sane-mpi

This component is implemented in C and offers support for storing and working with arbitrary large integers (multi precision integers or MPIs as we'll call them from now on). The code is rather messy but at the moment there isn't really anything better available and as it stands, this sane-mpi is at least readable now - mainly through the efforts of Stanislav Datskovskiy who extracted it from the big ball of mess that is GnuPG 1.4.10. I've made only a small further snip to his version, discarding a set of methods for accessing specific parts of an MPI. While such methods could conceivably be useful at some point, the point is that EuCrypt at the moment does *not* need them and moreover the existing implementation was so ugly as to need a re-write in any case. In short those parts failed to be either useful or at the very least sane, so they are no more.

There are lots of very useful comments all through the code of sane-mpi so the best option is really to actually read each file, at least once. There's no point in repeating those comments in here. I'll focus instead on a few aspects that I think are most relevant to EuCrypt and its purpose.

First issue is related to how and where MPIs are stored, because EuCrypt uses MPIs for storing private RSA keys. In principle, sane-mpi offers methods for allocating secure memory for any given MPI (see secmem.c and memory.c). However, a correct description of the service is that sane-mpi will attempt to allocate secure memory, with results depending on the machine and operating system you are running it on.

Note also that it pays to be mindful when performing operations with MPIs that are a mix of secure and insecure since undesired leaks may happen at intermediate steps. Sane-mpi attempts to avoid this by allocating additional secure memory space for intermediate results when one of the operands is in secure memory (see for instance mpi_mul method in mpi-mul.c). However, this (among others) means also that a simple arithmetical operation involving 2 MPIs will actually differ in execution depending not only on the values of the 2 numbers but also on where they are each stored. Moreover, this sort of side-effect memory allocation happens in fact in more than one place in sane-mpi.

Second issue refers to the way in which MPI operations are implemented in sane-mpi. Essentially following the implementation of one single arithmetical operation is not always very straightforward as it can easily lead one through several files (even when leaving aside the side-effect memory allocation issues and focusing instead only on the actual operations performed). From the point of view of fully and comfortably holding sane-mpi in head, this is rather unfortunate but it is what it is.

Since EuCrypt uses by necessity not only the multiplication and exponentiation but also direct bit setting of MPIs, I'll highlight mpi-bit.c as another file that should be studied in more detail. In particular, the method mpi_set_highbit can be confusing as it does a bit more than the name suggests: it sets the indicated bit but it ALSO clears all bits above the one indicated. Similarly, mpi_clear_highbit clears the indicated bit AND all bits above it. By contrast, mpi_set_bit and mpi_clear_bit are the more straightforward versions, simply setting or clearing, respectively, the indicated bit and nothing more.

To download, build and test sane-mpi as will be used in EuCrypt you will need:

If you need help with V, there are is a gentle introduction courtesy of Ben Vulpes.

In the next chapter I'll proceed to introduce parts of EuCrypt's own RSA that puts all this sane-mpi to some actual use. Stay tuned!



  1. Note that you will be effectively trusting this library with your data and money whenever you play Eulora so better understand it before playing the game. 

Theme and content by Diana Coman