Ossasepia

January 11, 2018

EuCrypt Chapter 5: C = M ^ e mod n

Filed under: EuCrypt — Diana Coman @ 4:21 p.m.

~ This is part of the EuCrypt series. Start with Introducing EuCrypt. ~

Having a true random number generator (trng) and on top of it a true random prime number generator (trpng) from previous chapters, I can now finally touch on RSA1 itself: this chapter adds a way to generate RSA keys and to actually use them directly to encrypt/decrypt a chunk of octets as they are given2. Future chapters will build further on this by adding for instance slicing of messages into adequate blocks, padding and hashing. For now though, I'll focus strictly on RSA itself, meaning on obtaining the components of a working RSA key pair and then using those for the two main RSA operations of encryption and decryption. Specifically:

  • RSA public key: n, e, where n is called the modulus and e is called the public exponent. Most importantly, n is obtained as a product of 2 secret primes (usually called p and q) that are randomly chosen.
  • RSA private key: n, d, where n is the same one as above in the public key and d is essentially an inverse of e that depends not only on e itself but also on the hidden p and q.
  • encryption: C = M ^ e mod n, where C is the resulting encrypted message (cipher), M is the message in plain-text, e is the public exponent of an RSA key pair and n is the public modulus of the same key pair.
  • decryption: M = C ^ d mod n, where M, C and n are the very same as in the encryption formula above and d is the secret exponent of the same key pair as above (the inverse of e, essentially).

Note that mathematically encryption and decryption are really the very same operation: an exponentiation modulo n (even same n). However, in practice the two operations need to be treated quite separately because they involve knowledge of different parts and that is the whole point from a cryptographic perspective: encryption means using someone's public key, hence the assumed knowledge is of e and n, nothing else; decryption however means using one's own key, hence there is full knowledge of not only e and n but also the hidden parts, p and q. This difference of knowledge translates at implementation time in a different route taken to perform those same exponentiations modulo n: while encryption has to proceed directly as it is, decryption can take a faster route using the Chinese Remainder Theorem (CRT).

It's worth mentioning that the use of CRT at this stage is considered acceptable for Eulora's needs. However, you need to make your own decision on this. Note that a truly non-leaking RSA requires effectively constant-time operations at all levels, so you'll need to throw away more than CRT itself if that's what you are aiming for - you'll probably want to have a look at FFA in such case.

Generating a RSA key pair in EuCrypt consists in the following steps:

  • use the true random prime number generator from Chapter 4 to obtain 2 random primes, p and q, of 2048 bits (256 octets) each. This value is precisely half of the intended key length, as per current TMSR RSA specification. Note that both primes will have top 2 bits set to 1 precisely to ensure that their product is indeed 4096 bits in length in all cases. See discussion in Chapter 4 for more details on the working of the generator itself.
  • Compare p and q and swap them if needed so that p is always less than q. Note that this is NOT a requirement of RSA itself but it is needed basically as helper for the use of the Chinese Remainder Theorem (CRT) to speed-up decryption, see next step and then the decryption part itself.
  • Calculate u so that u * p = 1 (mod q). This is effectively the inverse of p, mod q. Due to the previous step, p here is always less than q. As with the previous step, this is not required by RSA itself - it's simply a value that is used at decryption to speed up the calculations by means of using CRT.
  • Calculate the modulus n of the key pair, as n = p * q.
  • Calculate the Euler totient3 of n, phi = (p - 1) * (q - 1)
  • Choose another random prime between 3 and phi. This is e (3 < e < phi), the public exponent of the newly generated key pair. A few comments here:

    • The public exponent is not a fixed value and there really is no defensible reason why it should be fixed at the level of an RSA implementation meant to really serve the user4 rather than anyone else. So EuCrypt does not fix the public exponent to any value. By contrast, GnuPG fixes the public exponent to 65537 as it really knows better than you what you need and even what you could possibly want, what want and what choice, why should you even think of any such things? You as user of EuCrypt can of course fix anything you want, public exponent included, IF you want to.
    • Mathematically, RSA requires e to be merely co-prime with (p-1)*(q-1), NOT necessarily prime in itself. However, EuCrypt chooses here the stronger constraint of e strictly prime, as per previous discussion of the TMSR spec.
    • The chosen size of e is the same as that of p and q. This means that the public exponent will likely be large but at the same time the corresponding private exponent will not become tiny either. For more details see the same previous discussion of the TMSR spec, as linked above.
    • Because e is obtained the same way as any other prime in EuCrypt, it will also always have top 2 bits and bottom 1 bit (so 3 bits in total) all set to 1. Combined with the fact that minimum length accepted by the prime generator is 1 octet (8 bits), it follows that, as long as the current generator of primes is used, the e in EuCrypt will be in fact at all times at least 193 (1+64+128) even if the length of e is lowered (with current length of 256 octets, that minimum is significantly higher). Nevertheless, this higher limit is imposed by the current generator, not by the RSA algorithm itself and for this reason the check here is more lenient as it aims to reflect requirements of RSA rather than other characteristics of its current surroundings - basically the generator can be changed at a later time without having to touch the RSA code itself.
    • The search for a suitable e is iterative: if a prime happens to fail the boundary checks (so it's either too small or too large) then it is discarded and another prime is generated.
  • calculate the private exponent, d, such that e * d = 1 mod phi (the inverse of e, mod phi).

The implementation of the above relies on two new structures that hold the various parts of a public and private RSA key, respectively. Those two new structures are added by the .vpatch for this chapter to eucrypt/smg_rsa/include/smg_rsa.h:

typedef struct {
    MPI n;      /* modulus */
    MPI e;      /* public exponent */
} RSA_public_key;

typedef struct {
    MPI n;      /* public modulus */
    MPI e;      /* public exponent */
    MPI d;      /* private exponent: e*d=1 mod phi */
    MPI p;      /* prime  p */
    MPI q;      /* prime  q */
    MPI u;      /* inverse of p mod q */
} RSA_secret_key;

In the same file eucrypt/smg_rsa/include/smg_rsa.h, there is a further addition of the signature of the method that generates a RSA key pair:

/*********rsa.c*********/
/*
 * Generates a pair of public+private RSA keys using directly the entropy source
 * specified in eucrypt/smg_rsa/include/knobs.h
 *
 * ALL RSA keys are 4096 bits out of 2 2048 bits primes, as per TMSR spec.
 *
 * @param sk a fully-allocated structure to hold the generated keypair (secret
key structure holds all the elements anyway, public key is a subset of this)
 *
 * NB: this procedure does NOT allocate memory for components in sk!
 *     caller should ALLOCATE enough memory for all the MPIs in sk
 * Precondition:
 * MPIs in sk have known allocated memory for the nlimbs fitting their TMSR size
 */
void gen_keypair( RSA_secret_key *sk );

It's worth noting what the comments shout at you there as loud as they can: as in previous code, I avoid separating allocation of memory from de-allocation and for this reason, it's the caller's responsibility to allocate memory for the things they want to use (in this case the MPIs that make up the key pair).

If you wonder why the gen_keypair method takes only one argument rather than two (i.e. only a secret key structure rather than a secret key AND a public key structure) scroll back and look again at the two structures: the public key is really but a subset of the secret key, so the secret key structure effectively holds the whole "pair" of keys by itself.

The actual body of the gen_keypair method lives in a new file, eucrypt/smg_rsa/rsa.c:

*
 * An implementation of TMSR RSA
 * S.MG, 2018
 */

#include "smg_rsa.h"
#include 

void gen_keypair( RSA_secret_key *sk ) {
  /* precondition: sk is not null */
  assert(sk != NULL);

  /* precondition: enough memory allocated, corresponding to key size */
  int noctets_pq = KEY_LENGTH_OCTETS / 2;
  unsigned int nlimbs_pq = mpi_nlimb_hint_from_nbytes( noctets_pq);
  unsigned int nlimbs_n = mpi_nlimb_hint_from_nbytes( KEY_LENGTH_OCTETS);
  assert( mpi_get_alloced( sk->n) >= nlimbs_n);
  assert( mpi_get_alloced( sk->p) >= nlimbs_pq);
  assert( mpi_get_alloced( sk->q) >= nlimbs_pq);

  /* helper variables for calculating Euler's totient phi=(p-1)*(q-1) */
  MPI p_minus1 = mpi_alloc(nlimbs_pq);
  MPI q_minus1 = mpi_alloc(nlimbs_pq);
  MPI phi = mpi_alloc(nlimbs_n);

  /* generate 2 random primes, p and q*/
  /* gen_random_prime sets top 2 bits to 1 so p*q will have KEY_LENGTH bits */
  /* in the extremely unlikely case that p = q, discard and generate again */
  do {
    gen_random_prime( noctets_pq, sk->p);
    gen_random_prime( noctets_pq, sk->q);
  } while ( mpi_cmp( sk->p, sk->q) == 0);

  /* swap if needed, to ensure p < q for calculating u */
  if ( mpi_cmp( sk->p, sk->q) > 0)
    mpi_swap( sk->p, sk->q);

  /* calculate helper for Chinese Remainder Theorem:
      u = p ^ -1 ( mod q )
     this is used to speed-up decryption.
  */
  mpi_invm( sk->u, sk->p, sk->q);

  /* calculate modulus n = p * q */
  mpi_mul( sk->n, sk->p, sk->q);

  /* calculate Euler totient: phi = (p-1)*(q-1) */
  mpi_sub_ui( p_minus1, sk->p, 1);
  mpi_sub_ui( q_minus1, sk->q, 1);
  mpi_mul( phi, p_minus1, q_minus1);

  /* choose random prime e, public exponent, with 3 < e < phi */
  /* because e is prime, gcd(e, phi) is always 1 so no need to check it */
  do {
    gen_random_prime( noctets_pq, sk->e);
  } while ( (mpi_cmp_ui(sk->e, 3) < 0) || (mpi_cmp(sk->e, phi) > 0));

  /* calculate private exponent d, 1 < d < phi, where e * d = 1 mod phi */
  mpi_invm( sk->d, sk->e, phi);

  /*  tidy up: free locally allocated memory for helper variables */
  mpi_free(phi);
  mpi_free(p_minus1);
  mpi_free(q_minus1);
}

As usual, the method checks as well as it can its own preconditions that reduce in this case to some checks on the known size of allocated memory for some of the MPIs. However, these checks don't change the fact that it still is the caller's responsibility to allocate memory for *all* MPIs in the structure and to allocate *enough* such memory for each of them, too. The unfortunate souls who have some knowledge of the mpi lib might interject at this point with the observation that mpi methods tend to re-allocate memory whenever they deem it necessary without any concern whatsoever about whether it is their role to do so. This is true unfortunately and it's not something I'm going to sink time into fixing at this moment. Still, it's not in itself a valid excuse or otherwise a "reason" to fail to allocate the correct amount of memory from the start. Don't add to existing garbage lest you'll get eaten by rats later and don't get lazy just because there's nobody looking right now.

Encryption with a previously generated public RSA key is a simple exponentiation modulo n. However, as usual, digging through the parts of the mpi lib that are needed for this reveals that the mpi_powm method that does this exponentiation is not only rather gnarly but also unable to handle properly the corner case when the MPIs for storing the input (the message to encrypt so the MPI raised to power e) and the output (the result of the encryption, hence the result of the exponentiation) are the same. The "solution" of GnuPG on this is -again, as usual- to paper it over and force the RSA layer to take care to avoid this case by allocating memory and using basically a temporary copy of the original input MPI. As you can probably tell by now, I don't like this approach at all because it neither solves the problem nor flags it as such, clearly5. Instead, it avoids the problem but it does so in the wrong place (why should the encryption operation allocate new memory?) and with the unfortunate effect of hiding it from anything that builds on top of the RSA layer.

Since fixing this wobble of the mpi method promises to be more time-consuming than it's currently worth it, EuCrypt takes the second-best option on it: the encryption method clearly states as its precondition that input and output have to be two different MPIs; the reason for this is given too, so that the problem is documented clearly, not hidden; the method then checks this precondition and it aborts execution if the check fails; whenever the precondition is met, the method simply does precisely what it promised (i.e. the exponentiation) and nothing more. The signature of this method is in eucrypt/smg_rsa/include/smg_rsa.h:

/****************
 * Public key operation. Encrypt input with pk and store result into output.
 *
 *  output = input^e mod n , where e,n are elements of pkey.
 * NB: caller should allocate *sufficient* memory for output to hold the result.
 * NB: NO checks are made on input!
 *
 * @param output MPI with enough allocated memory to hold result of encryption
 * @param input MPI containing content to encrypt; it *has to be* different from
output!
 * @param pk the public key that will be used to encrypt input
 *
 * Precondition:
 *  output != input
 * Output and input have to be two distinct MPIs because of the sorry state of
the underlying mpi lib that can't handle properly the case when those are the
same.
 */
void public_rsa( MPI output, MPI input, RSA_public_key *pk );

The implementation of public_rsa is in eucrypt/smg_rsa/include/rsa.c and it barely has 3 lines in total, comment line included:

void public_rsa( MPI output, MPI input, RSA_public_key *pk ) {

  /* mpi_powm can't handle output and input being same */
  assert (output != input);

  mpi_powm( output, input, pk->e, pk->n );
}

Decrypting with a previously generated RSA private key is mathematically the same operation as encrypting. However, given the additional information available about the modulus (namely its factors, p and q), the implementation can take advantage of CRT to perform the same calculation faster (~4 times faster). This is implemented in EuCrypt in the method secret_rsa with signature in eucrypt/smg_rsa/include/smg_rsa.h:

/****************
 * Secret key operation. Decrypt input with sk and store result in output.
 *
 *  output = input^d mod n , where d, n are elements of skey.
 *
 * This implementation uses the Chinese Remainder Theorem (CRT):
 *
 *      out1   = input ^ (d mod (p-1)) mod p
 *      out2   = input ^ (d mod (q-1)) mod q
 *      h      = u * (out2 - out1) mod q
 *      output = out1 + h * p
 *
 * where out1, out2 and h are intermediate values, d,n,p,q,u are elements of
skey. By using CRT, encryption is *faster*. Decide for yourself if this fits
your needs though!
 * NB: it is the caller's responsibility to allocate memory for output!
 * NB: NO checks are made on input!
 *
 * @param output MPI with enough allocated memory to hold result of decryption
 * @param input MPI containing content to decrypt
 * @param sk the secret key that will be used to decrypt input
 */
void secret_rsa( MPI output, MPI input, RSA_secret_key *sk );

And the implementation of secret_rsa, in eucrypt/smg_rsa/rsa.c, with comments literally at every step:

void secret_rsa( MPI output, MPI input, RSA_secret_key *skey ) {
  /* at its simplest, this would be input ^ d (mod n), hence:
   *    mpi_powm( output, input, skey->d, skey->n );
   * for faster decryption though, we'll use CRT and Garner's algorithm, hence:
   *        u = p ^ (-1) (mod q) , already calculated and stored in skey
   *       dp = d mod (p-1)
   *       dq = d mod (q-1)
   *       m1 = input ^ dp (mod p)
   *       m2 = input ^ dq (mod q)
   *        h = u * (m2 - m1) mod q
   *   output = m1 + h * p
   * Note that same CRT speed up isn't available for encryption because at
encryption time not enough information is available (only e and n are known).
   */
  /* allocate memory for all local, helper MPIs */
  MPI p_minus1 = mpi_alloc( mpi_get_nlimbs( skey->p) );
  MPI q_minus1 = mpi_alloc( mpi_get_nlimbs( skey->q) );
  int nlimbs   = mpi_get_nlimbs( skey->n ) + 1;
  MPI dp       = mpi_alloc( nlimbs );
  MPI dq       = mpi_alloc( nlimbs );
  MPI m1       = mpi_alloc( nlimbs );
  MPI m2       = mpi_alloc( nlimbs );
  MPI h        = mpi_alloc( nlimbs );

  /* p_minus1 = p - 1 */
  mpi_sub_ui( p_minus1, skey->p, 1 );

  /* dp = d mod (p - 1) aka remainder of d / (p - 1) */
  mpi_fdiv_r( dp, skey->d, p_minus1 );

  /* m1 = input ^ dp (mod p) */
  mpi_powm( m1, input, dp, skey->p );

  /* q_minus1 = q - 1 */
  mpi_sub_ui( q_minus1, skey->q, 1 );

  /* dq = d mod (q - 1) aka remainder of d / (q - 1) */
  mpi_fdiv_r( dq, skey->d, q_minus1 );

  /* m2 = input ^ dq (mod q) */
  mpi_powm( m2, input, dq, skey->q );

  /* h = u * ( m2 - m1 ) mod q */
  mpi_sub( h, m2, m1 );
  if ( mpi_is_neg( h ) )
    mpi_add ( h, h, skey->q );
  mpi_mulm( h, skey->u, h, skey->q );

  /* output = m1 + h * p */
  mpi_mul ( h, h, skey->p );
  mpi_add ( output, m1, h );

  /* tidy up */
  mpi_free ( p_minus1 );
  mpi_free ( q_minus1 );
  mpi_free ( dp );
  mpi_free ( dq );
  mpi_free ( m1 );
  mpi_free ( m2 );
  mpi_free ( h );

}

And now that we have everything in place to actually generate RSA key pairs and proceed to encrypt and decrypt, a few new tests and timing methods are needed in eucrypt/smg_rsa/tests/tests.c:

/* Test encryption+decryption on noctets of random data, using sk
 * Output is written to file.
 */
void test_rsa_keys( RSA_secret_key *sk, unsigned int noctets, FILE *file ) {
  RSA_public_key pk;
  MPI test = mpi_alloc ( mpi_nlimb_hint_from_nbytes (noctets) );
  MPI out1 = mpi_alloc ( mpi_nlimb_hint_from_nbytes (noctets) );
  MPI out2 = mpi_alloc ( mpi_nlimb_hint_from_nbytes (noctets) );

  pk.n = mpi_copy(sk->n);
  pk.e = mpi_copy(sk->e);
  unsigned char *p;
  p = xmalloc(noctets);

  fprintf(file, "TEST encrypt/decrypt on %d octets of random datan", noctets);
  fflush(file);
  if (get_random_octets( noctets, p) == noctets) {
    mpi_set_buffer( test, p, noctets, 0 );

    fprintf(file, "TEST data:n");
    mpi_print(file, test, 1);
    fprintf(file, "n");
    fflush(file);

    public_rsa( out1, test, &pk );
    secret_rsa( out2, out1, sk );

    fprintf(file, "ENCRYPTED with PUBLIC key data:n");
    mpi_print(file, out1, 1);
    fprintf(file, "n");
    fflush(file);

    fprintf(file, "DECRYPTED with SECRET key:n");
    mpi_print(file, out2, 1);
    fprintf(file, "n");
    fflush(file);

    if( mpi_cmp( test, out2 ) )
      fprintf(file, "FAILED: RSA operation: public(secret) failedn");
    else
      fprintf(file, "PASSED: RSA operation: public(secret) passedn");
    fflush(file);

    secret_rsa( out1, test, sk );
    public_rsa( out2, out1, &pk );
    if( mpi_cmp( test, out2 ) )
      fprintf(file, "FAILED: RSA operation: secret(public) failedn");
    else
      fprintf(file, "PASSED: RSA operation: secret(public) passedn");
  }
  else
    fprintf(file, "FAILED: not enough bits returned from entropy sourcen");

  fflush(file);
  xfree(p);
  mpi_free( pk.n);
  mpi_free( pk.e);

  mpi_free( test );
  mpi_free( out1 );
  mpi_free( out2 );
}

void test_rsa( int nruns, FILE *fkeys, FILE *fout) {
  RSA_secret_key sk;
  int noctets = KEY_LENGTH_OCTETS;
  int noctets_pq = noctets / 2;
  int nlimbs = mpi_nlimb_hint_from_nbytes(noctets);
  int nlimbs_pq = mpi_nlimb_hint_from_nbytes(noctets_pq);
  int i;

  sk.n = mpi_alloc(nlimbs);
  sk.e = mpi_alloc(nlimbs);
  sk.d = mpi_alloc(nlimbs);
  sk.p = mpi_alloc(nlimbs_pq);
  sk.q = mpi_alloc(nlimbs_pq);
  sk.u = mpi_alloc(nlimbs_pq);

  printf("TEST RSA key generation and use with %d runsn", nruns);
  fflush(stdout);

  for (i = 0;i < nruns; i++) {
    gen_keypair(&sk);
    printf(".");
    fflush(stdout);

    mpi_print(fkeys, sk.n, 1);
    fwrite("n", sizeof(char), 1, fkeys);

    mpi_print(fkeys, sk.e, 1);
    fwrite("n", sizeof(char), 1, fkeys);

    mpi_print(fkeys, sk.d, 1);
    fwrite("n", sizeof(char), 1, fkeys);

    mpi_print(fkeys, sk.p, 1);
    fwrite("n", sizeof(char), 1, fkeys);

    mpi_print(fkeys, sk.q, 1);
    fwrite("n", sizeof(char), 1, fkeys);

    mpi_print(fkeys, sk.u, 1);
    fwrite("n", sizeof(char), 1, fkeys);

    test_rsa_keys(&sk, noctets_pq, fout);
    printf("*");
    fflush(stdout);
  }

  mpi_free(sk.n);
  mpi_free(sk.e);
  mpi_free(sk.d);
  mpi_free(sk.p);
  mpi_free(sk.q);
  mpi_free(sk.u);

}

void test_rsa_exp() {
  MPI msg = mpi_alloc(0);
  MPI expected = mpi_alloc(0);
  MPI result;

  RSA_public_key pk;
  pk.n = mpi_alloc(0);
  pk.e = mpi_alloc(0);

  printf("TEST verify of rsa exponentiation on input data: n");

  mpi_fromstr(msg, "0x
5B6A8A0ACF4F4DB3F82EAC2D20255E4DF3E4B7C799603210766F26EF87C8980E737579
EC08E6505A51D19654C26D806BAF1B62F9C032E0B13D02AF99F7313BFCFD68DA46836E
CA529D7360948550F982C6476C054A97FD01635AB44BFBDBE2A90BE06F7984AC8534C3
8613747F340C18176E6D5F0C10246A2FCE3A668EACB6165C2052497CA2EE483F4FD8D0
6A9911BD97E9B6720521D872BD08FF8DA11A1B8DB147F252E4E69AE6201D3B374B171D
F445EF2BF509D468FD57CEB5840349B14C6E2AAA194D9531D238B85B8F0DD352D1E596
71539B429849E5D965E438BF9EFFC338DF9AADF304C4130D5A05E006ED855F37A06242
28097EF92F6E78CAE0CB97");

  mpi_fromstr(expected, "0x
1FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF003051300
D0609608648016503040203050004406255509399A3AF322C486C770C5F7F6E05E18FC
3E2219A03CA56C7501426A597187468B2F71B4A198C807171B73D0E7DBC3EEF6EA6AFF
693DE58E18FF84395BE");
  result = mpi_alloc( mpi_get_nlimbs(expected) );

  mpi_fromstr(pk.n, "0x
CDD49A674BAF76D3B73E25BC6DF66EF3ABEDDCA461D3CCB6416793E3437C7806562694
73C2212D5FD5EED17AA067FEC001D8E76EC901EDEDF960304F891BD3CAD7F9A335D1A2
EC37EABEFF3FBE6D3C726DC68E599EBFE5456EF19813398CD7D548D746A30AA47D4293
968BFBAFCBF65A90DFFC87816FEE2A01E1DC699F4DDABB84965514C0D909D54FDA7062
A2037B50B771C153D5429BA4BA335EAB840F9551E9CD9DF8BB4A6DC3ED1318FF3969F7
B99D9FB90CAB968813F8AD4F9A069C9639A74D70A659C69C29692567CE863B88E191CC
9535B91B417D0AF14BE09C78B53AF9C5F494BCF2C60349FFA93C81E817AC682F0055A6
07BB56D6A281C1A04CEFE1");

  mpi_fromstr( pk.e, "0x10001");

  mpi_print( stdout, msg, 1);
  printf("n");

  public_rsa( result, msg, &pk);
  if ( mpi_cmp( result, expected) != 0 )
    printf( "FAILEDn");
  else
    printf( "PASSEDn");

  printf("Expected:n");
  mpi_print( stdout, expected, 1);
  printf("n");

  printf("Obtained:n");
  mpi_print( stdout, result, 1);
  printf("n");

  mpi_free( pk.n );
  mpi_free( pk.e );
  mpi_free( msg );
  mpi_free( expected );
  mpi_free( result );
}

void time_rsa_gen( int nruns ) {
  struct timespec tstart, tend;
  long int diff;
  int i;

  RSA_secret_key sk;
  int noctets = KEY_LENGTH_OCTETS;
  int noctets_pq = noctets / 2;
  int nlimbs = mpi_nlimb_hint_from_nbytes(noctets);
  int nlimbs_pq = mpi_nlimb_hint_from_nbytes(noctets_pq);
  sk.n = mpi_alloc(nlimbs);
  sk.e = mpi_alloc(nlimbs);
  sk.d = mpi_alloc(nlimbs);
  sk.p = mpi_alloc(nlimbs_pq);
  sk.q = mpi_alloc(nlimbs_pq);
  sk.u = mpi_alloc(nlimbs_pq);

  clock_gettime(CLOCK_MONOTONIC, &tstart);
  for (i = 0;i < nruns; i++) {
    gen_keypair(&sk);
  }
  clock_gettime(CLOCK_MONOTONIC, &tend);

  diff = tend.tv_sec-tstart.tv_sec;
  printf("TOTAL: %ld seconds for generating %d key pairsn", diff, nruns);
  printf("Average (%d runs): %f seconds per TMSR RSA key pair.n",
        nruns, diff / (1.0*nruns));
  mpi_free(sk.n);
  mpi_free(sk.e);
  mpi_free(sk.d);
  mpi_free(sk.p);
  mpi_free(sk.q);
  mpi_free(sk.u);
}

The testing suite grows as usual much more than the code itself. To put those new tests to use, there are a few more cases in the main function in tests.c:

    case 6:
      fk = fopen("keys.asc", "a");
      if ( fk == NULL )
        err("Failed to open file keys.asc!");
      fout = fopen("check_keys.asc", "a");
      if ( fout == NULL ) {
        fclose(fk);
        err("Failed to open file keys_check.asc!");
      }
      test_rsa(nruns, fk, fout);
      fclose(fk);
      fclose(fout);
      break;
    case 7:
      test_rsa_exp();
      break;
    case 8:
      time_rsa_gen(nruns);
      break;
    default:
      printf("Current test ids:n");
      printf("0 for timing entropy sourcen");
      printf("1 for entropy output testn");
      printf("2 for is_composite (Miller-Rabin) testn");
      printf("3 for timing Miller-Rabinn");
      printf("4 for random prime number generator testn");
      printf("5 for timing random prime number generatorn");
      printf("6 for testing rsa key pair generation and use;
writes to keys.asc and check_keys.ascn");
      printf("7 for testing rsa exponentiation (fixed data)n");
      printf("8 for timing rsa key pair generatorn");
  }

  return 0;
}

You are of course invited to write your own tests and run them to your satisfaction. The tests provided are meant as *minimal* tests, not by any means a full testing suite of any sort. Note that option 7 (test_rsa_exp) effectively performs the verification of asciilifeform's signature for his Chapter 6 of FFA (which you are warmly invited to read for that matter). Take it as a little exercise to change there the data so that you check instead my signature of the .vpatch for EuCrypt's chapters or any other RSA signatures you might have.

Some preliminary timings for this implementation of RSA key generation suggest that a key pair can take on average almost one hour to generate (54 minutes averaged time per key pair on a set of 30 generated pairs). This relatively long wait for a key pair is the unavoidable cost of using truly random primes as opposed to pseudo-random ones (since a non-suitable candidate is discarded rather than massaged into primality for instance) and of increasing the number of iterations that Miller-Rabin performs when checking whether a candidate number is indeed prime. Note that the timings here are not directly comparable in any case with the timings previously reported for a different version of the rsa code, due to differences in both the code itself (most significant: fewer Miller-Rabin iterations, fixed exponent) and in the choice of measurement method (most significant: the time reported here is overall execution time that is ~always more than the actual CPU time in any case).

The .vpatch for all of the above and its signature live from now on on my Reference Code Shelf with links here too, for your convenience:

In the next chapter I'll let the RSA folder alone for once (I'll get back to it later) and move on to the chosen hashing function, namely keccak. So for the next chapter, dust your keccak references and have your Ada manuals at the ready, as keccak implementation will be in Ada, no more C for a while. Hooray!


  1. Rivest, Shamir and Adleman, A Method for Obtaining Digital Signatures and Public-Key Cryptosystems 

  2. It took only 4 chapters or otherwise a whole month not to mention the work done prior to that, just to get to actually generate some RSA keys but hey, it could have been worse! By worse I mean I could have taken the "easy" route, naively using GnuPG itself, trusting it because everybody uses it and it is open source and it's been around for so long and it says it does RSA and Miller-Rabin and random numbers and saying it is surely now just as good as doing it only a whole lot easier, isn't it? 

  3. This gives the number of co-primes between 0 and the given number. As it is a multiplicative function, the totient phi(p*q) = phi(p)*phi(q) and since p and q are chosen to be prime numbers, they will each have all smaller positive numbers as their co-primes. Consequently, phi(n) = phi(p*q) = (p-1)*(q-1). 

  4. Note that the user is the one who reads and understands what they are running, as a bare minimum requirement. Most pointedly, a monkey pressing some/any/all keys is not user of anything, it's still a monkey and nothing else. 

  5. These 2 options are the only acceptable ways of dealing with a problem really: ideally you solve it, meaning you address its root cause and from there up; if the ideal option is not chosen because of any less-than-ideal but very real world constraints of any kind, the honest approach is to flag the issue, making it, if at all possible, even more visible than it was, certainly not less visible! 

11 Comments »

  1. This text http://www.dianacoman.com/2018/01/11/eucrypt-chapter-5-c-m-e-mod-n/#selection-457.7-469.449 is a bit misleading. In the primegen function the mpi will always be filled with the specified number of octets, even if all of these are zero. The 2 high bits and 1 low bits are always set, as no normalization happens in mpi_set_buffer or the primegen function, therefor the minimum number will be (2^bitness + 2^(bitness-1) + 1) and the maximum will be ((2^bitness) - 1) where bitness is 2048. In the tests "e" allocated as having the whole length (512 octets) but it will always be half this.

    So although "The search for a suitable e is iterative: if a prime happens to fail the boundary checks (so it’s either too small or too large) then it is discarded and another prime is generated." The e will in fact always fall within the boundaries by definition. Although I understand the code here, such extra and unneeded checks without comments make the interpretation of the code hard.

    Finally, I'm still getting my head around modulus calculus, but would a very large "e" imply a smaller "d"?

    Comment by ave1 — March 25, 2018 @ 3:31 p.m.

  2. ave1 A very large "e" does NOT imply a very small "d" because of how modular algebra works essentially (the main way I think of modular algebra is that one works with a number line that is *folded* to a fixed length if this makes any sense to you). The e and d are *modular* inverses (i.e. d = e^-1 mod phi) so not exactly same properties as usual. At some point I even tested this directly: http://trilema.com/2017/tmsr-rsa-spec-extremely-early-draft/#comment-123474

    Regarding the rest, I can see the point to adding a comment on that check in rsa.c for clarity perhaps (I assume you are talking of while ( (mpi_cmp_ui(sk->e, 3) < 0) || (mpi_cmp(sk->e, phi) > 0)); - correct me if I'm wrong). It is true that gen_keypair in rsa.c asks for a prime of size 2048 that WILL therefore be greater than 3 and less than phi. It is though a check that is part of the RSA algorithm and for this reason I think it should be there at all times - basically I don't think it's a good idea for the code to skip those checks because in the *current setup* they *should* indeed always pass. Anothere way to put this: rsa.c is an implementation of the RSA algorithm and that check is part of the algorithm therefore it should be there because gen_keypair does NOT have any control over the actual implementation of gen_random_prime. And I would really, really hate the situation when a change to gen_random_prime for instance would require for some reasons changes in the implementation of rsa or who knows where else (a bit a la mpi).

    I re-read a couple of times the selected text but I don't quite see how it is you find it misleading: the number generator will set mask 110...1 on whatever *length* it is given and this length is at least 1 octet (i.e. minimum is 2^7+2^6+1= 193 exactly as stated). This is the limit of the prime generator and it has nothing to do with RSA. As the next/last sentence there states: "Nevertheless, this higher limit is imposed by the current generator, not by the RSA algorithm itself and for this reason the check here is more lenient as it aims to reflect requirements of RSA rather than other characteristics of its current surroundings – basically the generator can be changed at a later time without having to touch the RSA code itself." Can you clarify what part here you think is incorrect / not clearly stated?

    Comment by Diana Coman — March 25, 2018 @ 8:48 p.m.

  3. Thank you for the clarification! Modular math keeps tripping me up.

    No, the paragraph is not misleading, I misread and did not read carefully enough.

    Yes, I was talking about this test "( (mpi_cmp_ui(sk->e, 3) e, phi) > 0));" and I'll interpret it as a check of the contract between the RSA key generator and the random prime generator (like in Eiffel). In the same trend, should then d also be checked? (Of course, not the upper boundary but in the same comments thread from trilema, I also see a mention of d needing to be larger then a fraction of the modulus).

    Last, is there a rationale for the current choice of e to have 256 octets / 2048 bits? (I want to save the keypair in a fixed length structure, so I'm going with 512 octets for now)

    Comment by Anonymous — March 26, 2018 @ 8:53 a.m.

  4. The length of e is in principle the user's choice really (they could even choose 65535 if they want to), not a hard requirement of RSA itself. So yes, I'd say it makes sense to go with 512 octets if you want to have fixed-length structure for all choices within TMSR current spec.

    Regarding d - do you mean the n^0.5 given as lower threshold in the linked comment? It's not set/strict either as such, hence no check. In principle d *can* be anything at all - by itself and on its own as far as I know now it isn't a vulnerability as such (and if you explicitly exclude more values you can also argue that you reduce the space that a brute force attack needs to cover, strictly speaking).

    Comment by Diana Coman — March 26, 2018 @ 9:46 p.m.

  5. "Regarding d – do you mean the n^0.5 given as lower threshold in the linked comment?"

    Yes d can be anything at all, it could even be 3 (with an incredibly low chance) which must be a problem because even if an attacker does not know *d*, it is trivial to try low numbers. So, there is a also minimal value of "d" that is related to when brute force becomes a problem (I do not think this is the n^0.5 limit, it will be lower).

    Comment by ave1 — March 27, 2018 @ 5:52 a.m.

  6. Do you have a specific brute force attack in mind or how do you see this happening exactly? Note that you don't *know* which key has a very small d or even if there is one with a small d.

    Comment by Diana Coman — March 27, 2018 @ 2:42 p.m.

  7. Yes, I know that. When you generate a small public e (say e < 2^32), the private d has to be larger than phi / e or else you cannot even get to 1 mod n. But, when you have a large e (say for example larger than 2^(bitness - 32)), you *might* have a small d (very very low changes of this, but still possible). So if you do not check for this in the generation of the e + d pair you *might* now be open to attack. And as an attacker, if you see a very large bitness "e", you can then try up to 2^32 and *might* be lucky.

    Comment by ave1 — March 27, 2018 @ 4:16 p.m.

  8. Yes, I know that. When you generate a small public e (say e < 2^32), the private d has to be larger than phi / e or else you cannot even get to 1 mod n. But, when you have a large e (say for example larger than 2^(bitness - 32)), you *might* have a small d (very very low changes of this, but still possible). So if you do not check for this in the generation of the e + d pair you *might* now be open to attack. And as an attacker, if you see a very large bitness "e", you encode a message with that e plus modulus and then try up to 2^32 values for d and *might* be lucky. (Note that I have no idea on the max feasable number here)

    In the current implementation this is impossible as e will have 2048 bits and d will have more. But a test for this would follow the same reasoning as the test for 3<e<phi, i.e. if the generator for e changes this invariant still has to hold and if not, retry.

    Comment by ave1 — March 27, 2018 @ 4:25 p.m.

  9. Hm, I will give it a bit more thought but at the moment the way I see it is this:

    1. The test 3<e<phi is part of the RSA algorithm i.e. it's a *mathematical* consideration and therefore it comes before any *security* consideration, so it's on a different level. Note that it's not a test of e being greater than 2^n for some n chosen for security considerations. I suppose the argument could be at most that well, then the test should be 3<=e<phi perhaps.

    2. One cannot make an implementation that can be guaranteed to not require changes (potentially scattered too) for *security reasons* if you change parameters/components it relies on. The best one can ask is that the RSA algorithm (mathematics, not security) itself is and remains correct regardless of what else changes.

    Comment by Diana Coman — March 27, 2018 @ 8:10 p.m.

  10. […] you might be aware, cryptography algorithms such as RSA and games such as Eulora rely to a significant extent on randomly generated numbers. As you might […]

    Pingback by EuCrypt Chapter 13: SMG RNG in Ossasepia — May 3, 2018 @ 5:52 p.m.

  11. [...] is simply that RSA is essentially too expensive for that. As it happens, it turns out that republican RSA with its fixed-size 256 octets (2048 bits) public exponent is anyway too expensive even for this [...]

    Pingback by SMG Comms Chapter 10: Actions and RSA Keys « Ossasepia — November 30, 2018 @ 1:58 p.m.

RSS feed for comments on this post. TrackBack URL

Leave a comment

Theme and content by Diana Coman