EuCrypt Chapter 10: OAEP with Keccak a la TMSR



February 15th, 2018 by Diana Coman

~ This is part of the EuCrypt series. Start with Introducing EuCrypt. ~

As I finally have both Keccak at bit-level (reference version) and Keccak at word-level (working horse version because reality bytes), the next step is to implement the TMSR OAEP (optimal asymmetric encryption padding) of a message. The OAEP in there stands indeed for the original approach by Bellare and Rogaway1, while the TMSR in there stands - as usual - for an honest discussion of issues encountered and decisions made as well as a better approach to very pointy and very real problems (not a perfect approach, sure; just a few degrees of magnitude better than what one finds for instance in GnuPG).

Before proceeding to the actual code, note that the "padding" term in OAEP is rather misleading with respect to the goal of the code in this chapter: while some actual padding is involved indeed, the whole process is best thought of as a type of encryption really. So EuCrypt Chapter 10 provides the implementation of an OAEP-based encryption / decryption scheme, a la TMSR, using Keccak as the underlying hash function. Specifically:

  • OAEP Encryption: result will *always* be a block of 4096 bits (512 octets). Each such resulting block can hold at most 1960 bits (245 octets) of the original message. Longer messages will simply have to be split into blocks of 1960 bits and then passed to the OAEP encryption procedure.

    • Step 1: [random octet][size1][size2]['T']['M']['S']['R']['-']['R']['S']['A'][random octet padding]*M

      The original message M is encapsulated in a block of 256 octets that has the following format: [random octet][size1][size2]['T']['M']['S']['R']['-']['R']['S']['A'][random octet padding]*M . Essentially the block starts with a random octet followed by 2 octets that hold the actual length (in bits) of M, followed by 8 reserved octets, followed by as many (or as few, potentially none) random octets as needed to pad M to the maximum size of 245 octets, followed by M itself. Whenever M has already the full length of 245 octets, there will be no padding at all. Whenever M is shorter than 245 octets, the remaining octets up to 245 will contain random bits and will be placed before M.2. The reserved octets are shown as "TMSR-RSA" currently - the values stored there however are not part of the standard so any implementation can store anything else, random bits included.

      This approach to encapsulating the original message M in a clear format that however does *not* use fixed values has significant benefits over the current approach used by GnuPG: first, the format is more flexible and easy to expand than any inband approach; second, the use of random (true random as well, NOT pseudo-random) rather than fixed-value padding fits the purpose: the whole point is to introduce entropy, not to take it away by having fixed values that even end up in predictable, fixed output (just go ahead and use GnuPG to encrypt various messages with various keys, compare the results and you'll see what I mean). And note the improvement here: there are still some fixed bits indeed3 but they are very, very few compared to quite anything else in common use at the moment.

    • Step 2: A block R of 2048 bits is filled with random bits.
    • Step 3: X = M00 xor hash(R)

      A block X of 2048 bits is calculated as X = M00 xor hash(R), where M00 is the block of 2048 bits from step 1, R is the block of 256 octets from step 2 and hash is the Keccak f-1600 permutation (aka the Keccak sponge with an internal state of 1600 bits as implemented in EuCrypt) used as a hash with the default bitrate (currently 1344 bits) and the output length set at 2048 bits.
    • Step 4: Y = R xor hash(X)

      A block Y of 2048 bits is calculated as Y = R xor hash(X), where R, X and hash are as described in previous steps above.
    • Step 5: Result = X || Y

      The OAEP encrypted block of 4096 bits is obtained by concatenating the previous 2 parts: X || Y.
  • OAEP Decryption: the result should be the original message previously encrypted with OAEP. This is effectively the reverse operation:

    • Step 1: Obtain X and Y as the two halves of the input block of 4096 bits (NB: input block HAS TO be precisely 4096 bits long as otherwise it is not a valid OAEP-encrypted block and therefore the decryption can't succeed.)
    • Step 2: R = Y xor hash(X)
    • Step 3: M00 = X xor hash(R)
    • Step 4: length of M is extracted as the value stored in bits 9 to 32 of M00 while M itself is extracted as the corresponding last bits at the end of M00.

The implementation for the above is done in a separate package that lives at the moment in eucrypt/smg_keccak/ (smg_oaep.ads smg_oaep.adb). This place for smg_oaep reflects the fact that the word-level implementation of keccak is the everyday workhorse for Eulora but note that the TMSR OAEP implementation itself is not dependent in any way on a specific Keccak implementation. The decision to implement this in Ada rather than C fits of course the long-term preference for Ada as main programming language for S.MG but it turned out it also fits very, very well my own personal preference (despite the fact that I really have very little experience with Ada and way more with C)4: while I *did* implement the thing in C as well, it took me twice as long and it was 10 times more painful (and it's still less clear to read in any case). Despite those huge advantages of Ada, legacy code in C (such as mpi) means that the smg_keccak library will have to communicate with C code, inevitably. The first steps towards that are included in this chapter as I add a hash function with C-style char* parameters so that C code can directly call Keccak for any hashing needs. This is how it looks like (in eucrypt/smg_keccak/oaep.ads):

  -- wrapper for calling from C
  -- @param Input the input string, as array of characters (C style)
  -- @param LenIn the length of the input string (as number of BITS)
  -- @param LenOut the desired number of bits to be returned as output
  -- @param Block_Len the bitrate used by the Keccak sponge (number of BITS)
  -- @return an array of characters with first LenOut bits set to Keccak output

  -- NB: caller HAS TO provide the length of the Input (parameter LenIn)
  -- NB: caller HAS TO provide the length of the Output (parameter LenOut)
  function Hash( Input     : Interfaces.C.Char_Array;
                 LenIn     : Interfaces.C.size_t;
                 LenOut    : Interfaces.C.size_t;
                 Block_Len : Interfaces.C.int := Default_Bitrate)
                 return Interfaces.C.Char_Array;
  pragma Export( C, Hash, "hash" );

One observation regarding this C-style Hash function: it will generate warnings at compilation due to the fact that Ada uses sane strings while C uses the "null-terminated string" approach5. For once (it's very rare indeed) however I'll live with those warnings: the LenIn and LenOut parameters are there precisely to specify the length and therefore avoid going outside the bounds of the allocated memory for that string.

The smg_oaep.ads file also defines the relevant constants and subtypes for OAEP encoding, the oaep_encrypt and oaep_decrypt procedures, a helper procedure xor_strings, conversion methods from string to bitstream as well as from bitstream to string and a wrapper for the Keccak sponge function so that hash can be called directly with string input and output (the wrapper converts to bitstream and back as relevant). While those conversions have a cost of course, it is unclear at this stage that this cost is indeed problematic for Eulora's needs. Consequently, any potential optimisations here (and in the whole EuCrypt for that matter) are an issue left for a later time. The full content of eucrypt/smg_keccak/smg_oaep.ads:

-- Implementation of TMSR's OAEP with Keccak as hash function
--
-- S.MG, 2018

with SMG_Keccak; use SMG_Keccak; -- Keccak is used as hash function
with Interfaces; use Interfaces; -- for Unsigned_8 type and bit-level ops
with Interfaces.C; use Interfaces.C; -- for interop with C

package SMG_OAEP is
  pragma Pure( SMG_OAEP ); -- stateless, no side effects -> can cache calls

  -- fixed length of OAEP block in bits and in octets
  OAEP_LENGTH_BITS   : constant := 4096;
  OAEP_LENGTH_OCTETS : constant := 512;
  OAEP_HALF_OCTETS   : constant := OAEP_LENGTH_OCTETS / 2;

  -- subtypes used by the OAEP encrypt/decrypt
  subtype OAEP_Block is String( 1 .. OAEP_LENGTH_OCTETS );
  subtype OAEP_HALF is String( 1 .. OAEP_HALF_OCTETS );

  -- padding & formatting of maximum 1960 bits of the given String
  -- uses TMSR's OAEP schema:
  -- 1.format M00 as: [random octet][sz1][sz2]"TMSR-RSA"[random]*Message
  --    where sz1 and sz2 store the length of the message in bits
  --    the random octets before message are padding to make OAEP_LENGTH_OCTETS
  -- 2. R = OAEP_HALF_OCTETS random bits
  -- 3. X = M00 xor hash(R)
  -- 4. Y = R xor hash(X)
  -- 5. Result is X || Y
  -- NB: the Entropy parameter should be random octets from which this method
  -- will use as many as required for the OAEP encryption of given Msg
  -- NB: at MOST OAEP_LENGTH_OCTETS - 11 octets of Msg! (Msg at most 1960 bits)
  procedure OAEP_Encrypt( Msg     : in String;
                          Entropy : in OAEP_Block;
                          Output  : out OAEP_Block);

  -- This is the opposite of OAEP_Encrypt above.
  -- @param Encr - an OAEP block previously obtained from OAEP_Encrypt
  -- @param Len - this will hold the length of the obtained message (in bits!)
  -- @param Output - the first Len octets of this are the recovered message
  -- @param Success - set to TRUE if message was recovered, false otherwise
  -- NB: when Success is FALSE, both Len and Output have undefined values
  procedure OAEP_Decrypt( Encr    : in OAEP_Block;
                          Len     : out Natural;
                          Output  : out OAEP_HALF;
                          Success : out Boolean);

  -- helper method, xor on strings
  -- NB: only Output'Length bits will be considered from S1 and S2
  -- NB: caller is responsible for S1 and S2 being long enough!
  procedure XOR_Strings( S1: in String; S2: in String; Output: out String );

  -- gnat-specific methods for bit-level operations
	function Shift_Right( Value  : Unsigned_8;
                        Amount : Natural )
                        return Unsigned_8;
  pragma Import(Intrinsic, Shift_Right);

	function Shift_Left( Value  : Unsigned_8;
                        Amount : Natural )
                        return Unsigned_8;
  pragma Import(Intrinsic, Shift_Left);

  -- conversions between bitstream and string
  -- NB: caller has to ensure correct size of output parameter! no checks here.
  procedure ToString( B: in Bitstream; S: out String );
  procedure ToBitstream( S: in String; B: out Bitstream );

  -- public wrapper for Sponge to use String for input/output
  procedure HashKeccak( Input     : in String;
                        Output    : out String;
                        Block_Len : in Keccak_Rate := Default_Bitrate);

  -- wrapper for calling from C
  -- @param Input the input string, as array of characters (C style)
  -- @param LenIn the length of the input string (as number of BITS)
  -- @param LenOut the desired number of bits to be returned as output
  -- @param Block_Len the bitrate used by the Keccak sponge (number of BITS)
  -- @return an array of characters with first LenOut bits set to Keccak output

  -- NB: caller HAS TO provide the length of the Input (parameter LenIn)
  -- NB: caller HAS TO provide the length of the Output (parameter LenOut)
  function Hash( Input     : Interfaces.C.Char_Array;
                 LenIn     : Interfaces.C.size_t;
                 LenOut    : Interfaces.C.size_t;
                 Block_Len : Interfaces.C.int := Default_Bitrate)
                 return Interfaces.C.Char_Array;
  pragma Export( C, Hash, "hash" );

end SMG_OAEP;

The corresponding implementation of those methods in eucrypt/smg_keccak/smg_oaep.adb follows closely the steps discussed above (and clearly stated in comments throughout the code as well):

-- S.MG, 2018

package body SMG_OAEP is

  procedure HashKeccak( Input     : in String;
                        Output    : out String;
                        Block_Len : in Keccak_Rate := Default_Bitrate) is
    BIn  : Bitstream( 0 .. Input'Length * 8 - 1 );
    BOut : Bitstream( 0 .. Output'Length * 8 - 1 );
  begin
    ToBitstream( Input, BIn);
    Sponge( BIn, BOut, Block_Len);
    ToString( BOut, Output );
  end HashKeccak;

  function Hash( Input     : Interfaces.C.Char_Array;
                 LenIn     : Interfaces.C.size_t;
                 LenOut    : Interfaces.C.size_t;
                 Block_Len : Interfaces.C.int := Default_Bitrate)
                 return Interfaces.C.Char_Array is
    AdaLenIn  : Natural := Natural(LenIn);
    AdaLenOut : Natural := Natural(LenOut);
    InStr     : String( 0 .. AdaLenIn-1 )  := (others => '0');
    OutStr    : String( 0 .. AdaLenOut-1 ) := (others => '0');
    COut      : Interfaces.C.Char_Array( 0 .. LenOut-1 );
    Count     : Natural := AdaLenOut;
    CCount    : Interfaces.C.size_t := LenOut;
  begin
    Interfaces.C.To_Ada( Input, InStr, AdaLenIn );
    HashKeccak( InStr, OutStr, Keccak_Rate(Block_Len) );
    Interfaces.C.To_C( OutStr, COut, CCount );
    return COut;
  end Hash;

  -- conversion between types
  procedure ToString(B: in Bitstream; S: out String ) is
    N   : Natural;
    Pos : Natural;
  begin
    Pos := B'First;
    for I in S'Range loop
      N := Natural( B( Pos     ) ) +
           Natural( B( Pos + 1 ) ) * 2 +
           Natural( B( Pos + 2 ) ) * 4 +
           Natural( B( Pos + 3 ) ) * 8 +
           Natural( B( Pos + 4 ) ) * 16 +
           Natural( B( Pos + 5 ) ) * 32 +
           Natural( B( Pos + 6 ) ) * 64 +
           Natural( B( Pos + 7 ) ) * 128;
      Pos := Pos + 8;
      S( I ) := Character'Val( N );
    end loop;
  end ToString;

  procedure ToBitstream(S: in String; B: out Bitstream ) is
    V   : Unsigned_8;
    Pos : Natural;
  begin
    Pos := B'First;
    for C of S loop
      V := Character'Pos( C );
      B( Pos     ) := Bit( V and 1 );
      B( Pos + 1 ) := Bit( Shift_Right( V, 1 ) and 1 );
      B( Pos + 2 ) := Bit( Shift_Right( V, 2 ) and 1 );
      B( Pos + 3 ) := Bit( Shift_Right( V, 3 ) and 1 );
      B( Pos + 4 ) := Bit( Shift_Right( V, 4 ) and 1 );
      B( Pos + 5 ) := Bit( Shift_Right( V, 5 ) and 1 );
      B( Pos + 6 ) := Bit( Shift_Right( V, 6 ) and 1 );
      B( Pos + 7 ) := Bit( Shift_Right( V, 7 ) and 1 );

      Pos := Pos + 8;
    end loop;
  end ToBitstream;

  -- padding & formatting of maximum 1960 bits of the given String
  -- uses TMSR's OAEP schema:
  -- 1.format M00 as: [random octet][sz1][sz2]"TMSR-RSA"[random]*Message
  --    where sz1 and sz2 store the length of the message in bits
  --    the random octets before message are padding to make OAEP_LENGTH_OCTETS
  -- 2. R = OAEP_HALF_OCTETS random bits
  -- 3. X = M00 xor hash(R)
  -- 4. Y = R xor hash(X)
  -- 5. Result is X || Y
  -- NB: the Entropy parameter should be random octets from which this method
  -- will use as many as required for the OAEP encryption of given Msg
  -- NB: at MOST OAEP_LENGTH_OCTETS - 11 octets of Msg! (Msg at most 1960 bits)
  procedure OAEP_Encrypt( Msg     : in String;
                          Entropy : in OAEP_Block;
                          Output  : out OAEP_Block) is
    M00    : OAEP_HALF;
    R      : OAEP_HALF;
    HashR  : OAEP_HALF;
    X      : OAEP_HALF;
    HashX  : OAEP_HALF;
    Y      : OAEP_HALF;
    MsgLen : Natural;
    MaxLen : Natural;
    PadLen : Natural;
    TMSR   : constant String := "TMSR-RSA";
  begin
    -- calculate maximum length of msg and needed amount of padding
    -- make sure also that only MaxLen octets at most are used from Msg
    MaxLen := OAEP_HALF_OCTETS - TMSR'Length - 3;  -- maximum msg that fits
    MsgLen := Msg'Length;                          -- real msg length
    if MsgLen > MaxLen then
      MsgLen := MaxLen;  --only first MaxLen octets will be considered
      PadLen := 0;       --no padding needed
    else
      PadLen := MaxLen - MsgLen; -- msg is potentially too short, add padding
    end if;

    -- step 1: header and format to obtain M00
      -- first octet is random bits
    M00( M00'First ) := Entropy( Entropy'First );

      -- next 2 octets hold the used length of Msg (number of octets)
    M00( M00'First + 2) := Character'Val( ( MsgLen * 8 ) mod 255 );
    M00( M00'First + 1) := Character'Val( ( (MsgLen * 8 ) / 255 ) mod 255 );

      -- next 8 octets are reserved for later use, currently "TMSR-RSA"
    M00( M00'First + 3 .. M00'First + 10 ) := TMSR;

      -- random bits for padding, if Msg is less than 245 octets
    for I in 1 .. PadLen loop
      M00( M00'First + 10 + I ) := Entropy( Entropy'First + I );
    end loop;

      -- the message itself
    M00( M00'Last - MsgLen + 1 .. M00'Last ) :=
                               Msg( Msg'First .. Msg'First + MsgLen - 1 );

    -- step 2: R = OAEP_HALF_OCTETS random octets
    -- can take LAST octets from given entropy as they are NOT used before
    -- (even if original message was empty, padding uses at most half - 10
    --   while entropy has full block length)
    R := Entropy( Entropy'Last - OAEP_HALF_OCTETS + 1 .. Entropy'Last );

    -- step 3: X = M00 xor hash(R)
    HashKeccak( R, HashR );
    XOR_Strings( M00, HashR, X );

    -- step 4: Y = R xor hash(X)
    HashKeccak( X, HashX );
    XOR_Strings( R, HashX, Y );

    -- step 5: Output is X || Y
    Output( Output'First .. Output'First + X'Length - 1 ) := X;
    Output( Output'Last - Y'Length + 1 .. Output'Last )   := Y;

  end OAEP_Encrypt;

  procedure OAEP_Decrypt( Encr    : in OAEP_Block;
                          Len     : out Natural;
                          Output  : out OAEP_HALF;
                          Success : out Boolean ) is
    X, Y, M, R   : OAEP_HALF;
    HashX, HashR : OAEP_HALF;
    MaxLen       : constant Natural := OAEP_LENGTH_OCTETS - 11;
    LenOctets    : Natural;
  begin
    -- step 1: separate X and Y
    X := Encr( Encr'First .. Encr'First + X'Length - 1 );
    Y := Encr( Encr'Last - Y'Length + 1 .. Encr'Last );

    -- step 2: R = Y xor hash(X)
    HashKeccak( X, HashX );
    XOR_Strings( Y, HashX, R );

    -- step 3: M = X xor hash(R)
    HashKeccak( R, HashR );
    XOR_Strings( X, HashR, M );

    -- step 4: extract length and message
    Len := Character'Pos( M( M'First + 1 ) ) * 255 +
           Character'Pos( M( M'First + 2 ) );
    LenOctets := Len / 8;

    if LenOctets > MaxLen or LenOctets < 0 then
      Success := False;  -- error, failed to retrieve message
    else
      Success := True;
      Output( Output'First .. Output'First + LenOctets - 1 ) :=
        M( M'Last - LenOctets + 1 .. M'Last );
    end if;

  end OAEP_Decrypt;

  -- helper method, xor on strings
  -- NB: only Output'Length bits will be considered from S1 and S2
  -- NB: caller is responsible for S1 and S2 being long enough!
  procedure XOR_Strings( S1: in String; S2: in String; Output: out String ) is
    V1, V2: Unsigned_8;
  begin
    for I in Output'Range loop
      V1 := Character'Pos( S1( I ) );
      V2 := Character'Pos( S2( I ) );
      Output( I ) := Character'Val( V1 xor V2 );
    end loop;
  end XOR_Strings;
end SMG_OAEP;

You might have noticed in the above that the OAEP_Encrypt method does not directly access a source of random bits. Instead, it simply relies on the caller to provide it with whatever random bits they want used (through the parameter Entropy). The main reason for this is the fact that access to a source of entropy is not in itself an OAEP concern and there is no reason for mixing it in here. Since the format used may require at the very most 4080 random bits6, the size of the Entropy parameter (4096) covers all situations and allows the OAEP method to simply use the random bits as they are given and to use each bit only once (if it is used). Essentially the responsibility for good entropy rests with the caller of OAEP, as it should be: it's your tool, you can use it poorly to poor results, certainly.

The helper method XOR_Strings sticks out a bit to me like a sore thumb. On one hand, Ada correctly refuses to treat characters (hence, strings) directly as numbers. On the other hand however the task at hand does exactly that, no matter how much sugar-coating one may put on it: the whole encrypting/hashing/decrypting/padding goes from characters to their numerical representation and back. Consequently, at this stage at least, I made myself the tool I needed and that is that. Once it exists, it can be changed or abandoned later, such is life, but more importantly it can actually be... used first.

You might have noticed in the above the call to the Keccak sponge being slightly different (parameter order) than in the previous chapter. This is indeed the case and the reason for it is the fact that I've introduced a default bitrate for the sponge so that the caller can rely on this if they don't specify a bitrate. The code with this small change in eucrypt/smg_keccak/smg_keccak.ads:

  Default_Bitrate: constant := 1344; --max bits the sponge can eat/spit without
                                     --needing to scramble the state

...and the new signature for the Sponge:

  -- public function, the sponge itself
  -- Keccak sponge structure using Keccak_Function, Pad and a given bitrate;
  -- Input - the stream of bits to hash (the message)
  -- Output - a bitstream of desired size for holding output
  -- Block_Len - the bitrate to use; this is effectively the block length
  --             for splitting Input AND squeezing output between scrambles
  procedure Sponge(Input      : in Bitstream;
                   Output     : out Bitstream;
                   Block_Len  : in Keccak_Rate := Default_Bitrate );

As usual, there are also tests for all the new methods (in eucrypt/smg_keccak/tests/smg_keccak-test.adb):

  procedure test_bitstream_conversion is
    S: String := "Aa*/";
    E: Bitstream( 0 .. 31 ) := (1, 0, 0, 0, 0, 0, 1, 0,
                                1, 0, 0, 0, 0, 1, 1, 0,
                                0, 1, 0, 1, 0, 1, 0, 0,
                                1, 1, 1, 1, 0, 1, 0, 0);
    B: Bitstream( 0 .. 31 );
    SS: String := "  t ";
  begin
    Put_Line("---Testing string to bitstream conversion---");
    ToBitstream( S, B );
    if E /= B then
      Put_Line("FAILED: string to bitstream conversion.");
    else
      Put_Line("PASSED: string to bitstream conversion.");
    end if;

    Put_Line("---Testing bitstream to string conversion---");
    ToString( B, SS );
    if SS /= S then
      Put_Line("FAILED: bitstream to string conversion");
      Put_Line("EXPECTED: " & S);
      Put_Line("OUTPUT: " & SS);
    else
      Put_Line("PASSED: bitstream to string conversion");
    end if;
  end test_bitstream_conversion;

  procedure test_hash_keccak is
    S: String := "X";
    O: String := "abc";
    B: Bitstream( 0 .. 23 );
    BB: Bitstream( 1.. 8):= (0, 0, 0, 1, 1, 0, 1, 0);
    Exp: Bitstream( 0 .. 23 ) := (1, 1, 1, 0, 0, 0, 0, 1,
                                  0, 1, 1, 0, 0, 0, 1, 0,
                                  1, 1, 1, 0, 0, 0, 1, 1);
  begin
    Put_Line("----Testing hash keccak on string " & S & "----");
    HashKeccak(S, O);
    Put_Line("OUTPUT: " & O);
    ToBitstream( O, B );
    if B /= Exp then
      Put_Line("FAILED: testing hash keccak on string");
      Put_Line("Output:");
      for I of B loop
        Put( Bit'Image( I ) );
      end loop;
      new_line(1);
      Put_Line("Expected:");
      for I of Exp loop
        Put( Bit'Image( I ) );
      end loop;
    else
      Put_Line("PASSED: testing hash keccak on string");
    end if;
    new_line(1);
  end test_hash_keccak;

  procedure test_xor_strings is
    S1     : String := "ABC";
    S2     : String := "CBA";
    Exp    : String := "...";
    Result : String := "...";
  begin
    Exp( Exp'First     ) := Character'Val( 2 );
    Exp( Exp'First + 1 ) := Character'Val( 0 );
    Exp( Exp'First + 2 ) := Character'Val( 2 );

    Put_Line("----Testing xor on strings---");
    XOR_Strings( S1, S2, Result);
    Put_Line("S1 is " & S1);
    Put_Line("S2 is " & S2);
    Put_Line("S1 xor S2 is " & Result);
    Put_Line("Result is: ");
    for C of Result loop
      Put( Natural'Image( Character'Pos( C ) ) );
    end loop;
    new_line(1);

    if Result /= Exp then
      Put_Line("FAILED: xor on strings");
    else
      Put_Line("PASSED: xor on strings");
    end if;
  end test_xor_strings;

  procedure test_oaep is
    Msg     : String := "abcdefghij jihgfedcba123456789";
    Encr    : OAEP_Block := ( others => ' ' );
    Decr    : OAEP_HALF  := ( others => ' ' );
    Entropy : OAEP_Block := ( others => 'e' );
    Len     : Natural;
    Flag    : Boolean;
  begin
    Put_Line("----Testing OAEP Encrypt----");
    OAEP_Encrypt( Msg, Entropy, Encr );

    Put_Line("----Testing OAEP Decrypt----");
    OAEP_Decrypt( Encr, Len, Decr, Flag );

    Put_Line("Msg is: "  & Msg);
    Put_Line("Encr is: " & Encr);
    Put_Line("Decr is: " & Decr);
    Put_Line("Flag is: " & Boolean'Image( Flag ) );
    Put_Line("Len is: "  & Natural'Image( Len ) );

    if Flag = False or
       Len /= Msg'Length * 8 or
       Decr( Decr'First .. Decr'First + Msg'Length - 1 ) /= Msg
       then
      Put_Line("FAILED: oaep test");
    else
      Put_Line("PASSED: oaep test");
    end if;

  end test_oaep;

Corresponding calls from the main function in eucrypt/smg_keccak/smg_keccak-test.adb:


  -- test bitstream conversion
  test_bitstream_conversion;

  -- test hash keccak (strings version)
  test_hash_keccak;

  -- test oaep encrypt + decrypt
  test_oaep;

  -- test xor on strings
  test_xor_strings;

The .vpatch for this chapter can be found on my Reference Code Shelf and is linked here too, for your convenience. UPDATE 1 October 2018 - added patch to fix the error of using 255 instead of 256 in the oaep part7.:


  1. Bellare, M. and Rogaway, P., 1994. "Optimal Asymmetric Encryption - How to Encrypt with RSA", in Advances in Cryptology - Eurocrypt 94 Proceedings, Lecture Notes in Computer Science Vol. 950, A. De Santis ed., Springer-Verlag. 

  2. The reason for this, directly from S.MG boardroom discussions: "mircea_popescu: motivul are de-a face cu modelele de interpolare si cu faptul ca e in principiu mai simplu sa ghicesti msb decat lsb." 

  3. The size of the message is stored on 16 bits but given that the maximum message size is actually 1960 bits, it follows that the first 5 bits will be for now always 0 (1960 is 111 1010 1000 in binary, so it needs only 11 bits out of the 16 bits available). It can also be argued that the "TMSR-RSA" octets are further fixed values but note that the hard standard is simply on reserving those octets, not on the values that are stored in them per se. In other words, you are free to store random bits in there as well and all will be fine. Overall there are basically 5 bits with fixed values in this schema, compared to 24 bits with fixed values + a lot more pseudo-random ones in other existing schema such as 0x00||0x02||pseudo-random-non-zero-octets||0x00||M  

  4. I guess in this code for OAEP my inexperience with Ada also shows more than usual, as everything gets more complex. For that matter I'm quite sure that it *could* be done better, more elegantly and so on but for all this "could" look that it hasn't been and so I'm stuck doing it now. So if you see something horrible, then go and implement your full version of EuCrypt, sign it, explain it and only then come and say anything to me about how ugly mine is. 

  5. I could have avoided this situation by using Strings of variable length in Ada for instance but I don't see the real benefits of that, while I can certainly see a few downsides that I don't like at all. 

  6. This extreme case of needing 4080 random bits occurs only when the message is empty and the 64 reserved bits are filled with random bits rather than "TMSR-RSA", there are only 16 non-random bits, namely the ones storing the length of the message. 

  7. Thanks PeterL for spotting the error! Note that the encrypt/decrypt still worked but 1. only for as long as one used the Eucrypt code for both (i.e. symmetrical mistake) and 2. it does reduce the length that can be stored (although this is not in itself a problem since anyway the maximum length as per TMSR OAEP would still fit)  

Comments feed: RSS 2.0

20 Responses to “EuCrypt Chapter 10: OAEP with Keccak a la TMSR”

  1. PeterL says:

    Maybe I am missing something here?

    As I understand it, to use RSA the message must be smaller than the modulus. During key generation you set the highest bits of the key to 1, but you can still end up with, e.g. my key, a modulus that starts 1100...

    During oaep the highest bits are the result of xor'ing a random byte with the result of a keccack hash, which would give something like a random distribution. So I should expect the message to be larger than my key something like 3/16th of the time. Does this get checked somewhere, or am I missing that the top bit is somehow set to 0 somewhere?

  2. PeterL says:

    Great, that was just what I was looking for. Thanks!

  3. esthlos says:

    Reality bytes, hah! I felt this pain writing my own version too...

  4. Diana Coman says:

    Oh hey esthlos, how's your Keccak coming up?

  5. PeterL says:

    Is there a reason that in the OAEP message size calculation are you using divide and mod by 255 instead of 256?

  6. Diana Coman says:

    Huh, it should be 256, good catch.

  7. Diana Coman says:

    And updated with fixing patch. Thank PeterL for spotting and pointing out the error!

  8. PeterL says:

    "diana_coman: PeterL - was waiting on your patch for the 255 instead of 256 error on keccak but since it didn't come, I patched it"

    Somehow, it didn't even cross my mind that I could write the patch myself. I just figured you would include it in whatever the next update was going to be. But now that I think about it, it does make sense for this to be a patch by itself rather than part of a larger thing, because it is better to have small, focused patches that are easy to read.

  9. Diana Coman says:

    Certainly. For one thing a fix is "one thing" in itself and for the other there's no reason to wait for next update (not to mention that EuCrypt isn't really schedulled to have updates at the moment - it gets them only if there's some problem found or perhaps some new requirement).

  10. The original message M is encapsulated in a block of 256 octets that has the following format: [random octet][size1][size2]['T']['M']['S']['R']['-']['R']['S']['A'][random octet padding]*M . Essentially the block starts with a random octet followed by 2 octets that hold the actual length (in bits) of M, followed by 8 reserved octets, followed by as many (or as few, potentially none) random octets as needed to pad M to the maximum size of 245 octets, followed by M itself.

    Years later I am no longer quite so happy with this ; if the format instead became [size]M (with size = size of M in octets) then the maximal M would be 255 instead of 245 ; at the cost of losing the ability to move odd bit sized messages. Is this even a thing in actual practice, do we ever specifically want 19 as opposed to 24 bits be moved through Eulora anything ?

  11. Diana Coman says:

    I went again through all the discussions that led to the format described here (there are the s.mg board ones and, otherwise, after a failed attempt to engage the forum on this, there is the #trilema announcement as to chosen format) as well as the code implementing this and this is what I've got:
    1. I don't quite see any clear case for the bit size as opposed to byte size and so I'd go with the proposed change to use a single octet and give the size of the message in bytes rather than in bits. (It seems to me there was a focus on bit as the actual unit but in practice I'd say that's not really the case ever, it's always byte and otherwise in crammed situations -network transport, mainly- one abuses a byte to carry different things but even so, nobody can actually send 7 bits without sending the 8th, as it were. So yeah, theoretically one can say that there are 13 bits of message and even cram some random bits in that 2nd byte but I fail to see any case where this really has to be an oaep concern somehow.
    2. I would however keep at least one random octet in front - there was a point to that and I don't see anything invalidating it now: namely to reduce the structure of the whole thing even when/if someone sends otherwise highly structured/similar messages.
    3. The TMSR-RSA octets were mainly "reserved" rather than fixed to anything. Perhaps it makes sense to still keep this option of "x reserved octets" in the format but otherwise if it doesn't make sense anymore, I'd say that there is indeed no problem at all to just expand the space available to M - basically M gains thus 8 octets more from here (+1 from size, it brings it to 254 , if indeed not to the full 255 meaning therefore that there is 1 bit in that size octet that is known to be 0, true enough).

  12. I re-read myself and yeah, the first random byte's not a bad idea. The rest I'd rather not burn -- every reserved byte is a cost ; it's true that not all messages are fully loaded ; but in many cases they indeed are, and why waste 5% of anything ? In 2015 it might've looked like it's a reasonable tax to pay to the republic as then imagined ; but it's outright ludicrous to pay a farthing to what actually came out.

    I think we'll live with the structure implicit in the 2nd bit of the 2nd octet being known to be 0 slightly more often than randomly ; so let's make it [R][C][M1]...[M254] and call it good. I am guessing we even get one extra serpent key from this per packet.

  13. Diana Coman says:

    Sounds good. Looking now in the Eulora code, the RSA key in Eulora is 490 rather than the 512 assumed here, so the OAEP block length changes accordingly.

  14. […] I was (re)reading today through my own OAEP with Keccak code from the previous chapter, I found an error1. And while I take all the blame for it being there in […]

  15. […] surely be smaller than n. And since the highest bit of the resulting oaep block is quite random (TMSR OAEP uses *by design* a significant number of *true* random bits), it follows that it’s really […]

  16. […] clarify the issue here: currently Keccak is part of EuCrypt mainly because at the time of its implementation there simply was no better place for it. But […]

  17. [...] implementation? ↩One of the earliest attempts with char_array is published here. ↩On a side note and just like with a lot of other things, doing this is an exercise in [...]

  18. [...] on, I'll implement V for production use as well, with everything adapted to purpose, from cryptography to diff-ing, pressing and everything in between. There's neither need nor any bit of desire to rely [...]

Leave a Reply to PeterL