SMG Comms Chapter 2: Raw Types



October 16th, 2018 by Diana Coman

~ This is a work in progress towards an Ada implementation of Eulora's communication protocol. Start with Chapter 1.~

My previous Chapter 1, with its assorted warts and lamentations, sparked a rather productive discussion in the forum that ended with a significant revision of the protocol specification. The main change concerns packet sizes and types: ALL packets will be either 1470 octets long and RSA encrypted OR 1472 octets long and Serpent encrypted. Those sizes are chosen to ensure that UDP packets sent on the wire are not fragmented, since fragmentation significantly increases the chance of packet loss. A test with those UDP packet sizes sent back and forth between UK and UY did not uncover any trouble - if anything, the results were actually better than expected. In any case, increased chances of reliable communications aside, the added benefit of those fixed, unique sizes is a clearer protocol altogether.

So after a bit of digesting the new specification, I can finally say that I have some sort of plan for its implementation, rather than grasping at it from whatever thread I could perceive (as was pretty much the case in Chapter 1). Specifically, my current implementation plan is made of 3 layers that build on one another and give me a structure to rely on:

  1. Layer 0 - Raw types
  2. Layer 1 - Sender/Receiver
  3. Layer 2 - Data structures
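
As a side note on the packet sizes above, here is a minimal sketch of the arithmetic behind "no fragmenting". It assumes a standard Ethernet MTU of 1500 octets and an IPv4 header without options; both figures are assumptions of this sketch, not statements taken from the protocol specification, and the names are made up for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Packet_Size_Check is
   -- ASSUMPTIONS of this sketch, not part of the protocol specification:
   -- standard Ethernet MTU, IPv4 header without options, plain UDP header
   Ethernet_MTU    : constant := 1500;
   IPv4_Header     : constant := 20;
   UDP_Header      : constant := 8;
   Max_UDP_Payload : constant := Ethernet_MTU - IPv4_Header - UDP_Header;

   -- packet sizes as given in the protocol specification
   SERPENT_PKT_OCTETS : constant := 1472;
   RSA_PKT_OCTETS     : constant := 1470;

   -- both packet types have to fit in a single unfragmented datagram
   -- (with GNAT, assertions are checked only when enabled, e.g. -gnata)
   pragma Assert (SERPENT_PKT_OCTETS <= Max_UDP_Payload);
   pragma Assert (RSA_PKT_OCTETS <= Max_UDP_Payload);
begin
   Put_Line ("largest unfragmented UDP payload (octets):"
             & Integer'Image (Max_UDP_Payload));
end Packet_Size_Check;
```

Under those assumptions, the Serpent packet size is exactly the largest payload that fits in one frame, with the RSA packet 2 octets below it.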

1. Layer 0 - Raw types

The raw types for Eulora's communication protocol are essentially two: octet (the basic unit, effectively 8 bits) and arrays of octets (of various fixed sizes matching the basic types in the protocol's specification). However, for the arrays of octets, there is a useful distinction to be made between packets (i.e. the received array of octets of fixed length, either 1470 or 1472 octets) and messages (i.e. the content of a packet). Essentially, messages are obtained from packets through decryption, using either Serpent (for packets 1472 octets long) or RSA (for packets 1470 octets long). Symmetrically, packets are obtained by encrypting messages with either Serpent (for messages 1472 octets long) or RSA (for messages 234 octets long). Consequently, layer 0 offers the definitions of types that cover the full set of possible messages and packets, together with the relevant conversion methods between packets and messages as well as between raw octets and int/unsigned/float data types. Note that a "message" here is simply the decrypted content of a packet, nothing more. The interpretation of the octets in a message and/or their use to fill some data structure that the protocol defines happens at a higher layer, not here.

2. Layer 1 - Sender/Receiver

Layer 1 is concerned with sending and receiving packets using the types and conversion methods provided by layer 0, together with the UDP lib for the actual network communication. At the moment this looks like a very thin layer, little more than a basic receiver and sender: the receiver simply uses the UDP lib to endlessly read all incoming packets and store them in a queue for later processing by consumer(s) external to this layer; the sender may have an input queue from which it keeps reading packets together with their intended destination and then sends them along using the UDP lib. This layer comes right after the raw types because it needs and uses them, but it's not yet clear to me whether it belongs in a protocol implementation at all, since I can see the case for leaving it to the user application of the protocol. At any rate, the protocol specification is not really concerned with this part and this description of the sender/receiver is not a given - it may change as required by each application.
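
To make the receiver's queue a bit more concrete, here is a minimal sketch of a bounded, protected packet queue. It is only an illustration under stated assumptions: the names (Inbox, Enqueue, Dequeue), the capacity of 8 and the restriction to Serpent packets are all made up, and the UDP lib is left out entirely, so the "received" packet is filled in by hand.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces;  use Interfaces;

procedure Queue_Sketch is
   -- local copies of the raw types, so that the sketch stands alone
   type Octets is array (Natural range <>) of Unsigned_8;
   subtype Serpent_Pkt is Octets (1 .. 1472);

   -- HYPOTHETICAL capacity, chosen only for this illustration
   Capacity : constant := 8;
   type Slot_Index is mod Capacity;   -- wraps around: circular buffer
   type Slots is array (Slot_Index) of Serpent_Pkt;

   -- the receiver would call Enqueue; consumers would call Dequeue
   protected Inbox is
      entry Enqueue (P : in  Serpent_Pkt);  -- blocks while the queue is full
      entry Dequeue (P : out Serpent_Pkt);  -- blocks while the queue is empty
   private
      Data  : Slots;
      Head  : Slot_Index := 0;
      Tail  : Slot_Index := 0;
      Count : Natural    := 0;
   end Inbox;

   protected body Inbox is
      entry Enqueue (P : in Serpent_Pkt) when Count < Capacity is
      begin
         Data (Tail) := P;
         Tail  := Tail + 1;
         Count := Count + 1;
      end Enqueue;

      entry Dequeue (P : out Serpent_Pkt) when Count > 0 is
      begin
         P := Data (Head);
         Head  := Head + 1;
         Count := Count - 1;
      end Dequeue;
   end Inbox;

   -- stand-in for a packet read from the network
   Incoming : Serpent_Pkt := (others => 0);
   Received : Serpent_Pkt;
begin
   Incoming (1) := 16#AA#;
   Inbox.Enqueue (Incoming);   -- what a receiver task would do
   Inbox.Dequeue (Received);   -- what a consumer would do
   Put_Line ("same packet out: " & Boolean'Image (Received = Incoming));
end Queue_Sketch;
```

In an actual application the two calls would sit in separate tasks, and RSA packets would need either their own queue or a wrapper type covering both packet kinds.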

3. Layer 2 - Data Structures

This layer is on top of Layer 0 - it provides the data structures for the different types of messages in Eulora's communication protocol. While any message in Eulora's protocol is just an array of octets at layer 0, the information contained in any message has to be structured according to the protocol's rules described in sections 4, 5 and 7 of the specification. This layer provides the definitions of those data structures as well as the conversion methods from messages to data structures and back.
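
As a rough illustration of what "conversion methods from messages to data structures and back" could look like, here is a minimal sketch. The record Demo_Info, its fields and its position in the message are entirely HYPOTHETICAL and not taken from the specification; the zero padding is a placeholder as well.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces;  use Interfaces;
with Ada.Unchecked_Conversion;

procedure Structure_Sketch is
   -- local copies of the raw types, so that the sketch stands alone
   type Octets is array (Natural range <>) of Unsigned_8;
   subtype Octets_2 is Octets (1 .. 2);
   subtype Serpent_Msg is Octets (1 .. 1472);

   function Cast is new Ada.Unchecked_Conversion (Unsigned_16, Octets_2);
   function Cast is new Ada.Unchecked_Conversion (Octets_2, Unsigned_16);

   -- HYPOTHETICAL data structure, NOT one from the protocol specification
   type Demo_Info is record
      Kind    : Unsigned_8;
      Counter : Unsigned_16;
   end record;

   -- data structure -> message: octet 1 is Kind, octets 2 .. 3 are Counter
   procedure To_Message (Info : in Demo_Info; Msg : out Serpent_Msg) is
   begin
      Msg := (others => 0);   -- placeholder padding, for illustration only
      Msg (1)      := Info.Kind;
      Msg (2 .. 3) := Cast (Info.Counter);
   end To_Message;

   -- message -> data structure, reading the same octets back
   function From_Message (Msg : Serpent_Msg) return Demo_Info is
   begin
      return (Kind    => Msg (1),
              Counter => Cast (Octets_2 (Msg (2 .. 3))));
   end From_Message;

   Original : constant Demo_Info := (Kind => 7, Counter => 513);
   Msg      : Serpent_Msg;
   Decoded  : Demo_Info;
begin
   To_Message (Original, Msg);
   Decoded := From_Message (Msg);
   Put_Line ("round trip ok: " & Boolean'Image (Decoded = Original));
end Structure_Sketch;
```

The octet order of Counter inside the message follows the machine's byte order here, since Cast is a blind copy; a real implementation would have to pin that order down according to the specification.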

Note that those 3 layers do not include protocol mechanics since those are really part of the server or client applications. Perhaps a demo / test application implementing the specified protocol mechanics would be useful to include as part of this implementation but I'd rather not specify it as such at this point. For now I'm fine with those 3 layers but if you see any holes in them or problems with them, let me know in the comments below!

Implementation

To start it all off, here's the first part of layer 0 - raw types, including the definitions of types and the most basic conversions via Ada.Unchecked_Conversion:

 -- raw types for the communication protocol
 -- these are used throughout at the lowest level of the protocol
 -- essentially they are the units of packets and of messages
 -- SMG.Comms has only 2 types of packets: RSA and Serpent
 -- a message is the decrypted content of a packet
 -- S.MG, 2018

with Interfaces; use Interfaces; -- Unsigned_n and Integer_n
with Ada.Unchecked_Conversion;

package Raw_Types is

  -- constants from SMG.COMMS standard specification
    -- size of a serpent-encrypted packet and message, in octets
    -- note that this corresponds to 1472/16 = 92 Serpent blocks
    -- NB: lengths are the same but the distinction makes the code clearer
  SERPENT_PKT_OCTETS : constant Positive := 1472;
  SERPENT_MSG_OCTETS : constant Positive := SERPENT_PKT_OCTETS;

    -- size of a RSA-encrypted packet and message in octets and bits
  RSA_PKT_OCTETS     : constant Positive := 1470;
  RSA_MSG_OCTETS     : constant Positive := 234;
  RSA_MSG_BITS       : constant Positive := RSA_MSG_OCTETS * 8; --1872

  -- raw, low-level types
  -- all messages and packets are simply arrays of octets at low level/raw
  type Octets is array( Natural range <> ) of Interfaces.Unsigned_8;

  -- raw representations of basic types (with fixed, well-defined sizes)
  subtype Octets_1 is Octets( 1 .. 1 );
  subtype Octets_2 is Octets( 1 .. 2 );
  subtype Octets_4 is Octets( 1 .. 4 );
  subtype Octets_8 is Octets( 1 .. 8 );

  -- RSA packets and contained raw messages
  subtype RSA_Pkt is Octets( 1 .. RSA_PKT_OCTETS );
  subtype RSA_Msg is Octets( 1 .. RSA_MSG_OCTETS );

  -- Serpent packets and contained raw messages
  -- NB: length is the same but the distinction makes the code clearer
  subtype Serpent_Pkt is Octets( 1 .. SERPENT_PKT_OCTETS );
  subtype Serpent_Msg is Octets( 1 .. SERPENT_MSG_OCTETS );

  -- blind, unchecked casts ( memcpy style )
  function Cast is new Ada.Unchecked_Conversion( Integer_8  , Octets_1 );
  function Cast is new Ada.Unchecked_Conversion( Octets_1   , Integer_8 );
  function Cast is new Ada.Unchecked_Conversion( Unsigned_8 , Octets_1 );
  function Cast is new Ada.Unchecked_Conversion( Octets_1   , Unsigned_8 );

  function Cast is new Ada.Unchecked_Conversion( Integer_16 , Octets_2 );
  function Cast is new Ada.Unchecked_Conversion( Octets_2   , Integer_16 );
  function Cast is new Ada.Unchecked_Conversion( Unsigned_16, Octets_2 );
  function Cast is new Ada.Unchecked_Conversion( Octets_2   , Unsigned_16 );

  function Cast is new Ada.Unchecked_Conversion( Integer_32 , Octets_4 );
  function Cast is new Ada.Unchecked_Conversion( Octets_4   , Integer_32 );
  function Cast is new Ada.Unchecked_Conversion( Unsigned_32, Octets_4 );
  function Cast is new Ada.Unchecked_Conversion( Octets_4   , Unsigned_32 );

  -- Gnat's Float has 32 bits but this might be different with other compilers
  function Cast is new Ada.Unchecked_Conversion( Float, Octets_4 );
  function Cast is new Ada.Unchecked_Conversion( Octets_4, Float );

  function Cast is new Ada.Unchecked_Conversion( Integer_64, Octets_8 );
  function Cast is new Ada.Unchecked_Conversion( Octets_8, Integer_64 );
  function Cast is new Ada.Unchecked_Conversion( Unsigned_64, Octets_8 );
  function Cast is new Ada.Unchecked_Conversion( Octets_8, Unsigned_64 );

end Raw_Types;
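
For a feel of how those casts behave, here is a minimal usage sketch. It repeats the few relevant declarations locally so that it compiles on its own; the procedure name and the test value are made up for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces;  use Interfaces;
with Ada.Unchecked_Conversion;

procedure Cast_Demo is
   -- local copies of the relevant Raw_Types declarations
   type Octets is array (Natural range <>) of Interfaces.Unsigned_8;
   subtype Octets_4 is Octets (1 .. 4);

   function Cast is new Ada.Unchecked_Conversion (Unsigned_32, Octets_4);
   function Cast is new Ada.Unchecked_Conversion (Octets_4, Unsigned_32);

   Value : constant Unsigned_32 := 16#0A0B0C0D#;
   Raw   : constant Octets_4    := Cast (Value);  -- value -> raw octets
   Back  : constant Unsigned_32 := Cast (Raw);    -- raw octets -> value
begin
   -- the ORDER of the printed octets depends on the machine's byte order,
   -- since the cast is a blind copy of the value's memory representation
   for I in Raw'Range loop
      Put (Unsigned_8'Image (Raw (I)));
   end loop;
   New_Line;

   -- the round trip on the same machine gives back the original value
   Put_Line ("round trip ok: " & Boolean'Image (Back = Value));
end Cast_Demo;
```

The round trip always holds on a single machine, but two machines with different byte orders would disagree on what the same 4 octets mean, so the octet order on the wire has to be settled somewhere above these blind casts.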

I packed the above code as a .vpatch on top of the previous genesis .vpatch, mainly to keep my original promise of showing this being built as it is, with detours and corrections on the way. Otherwise this could equally well be a genesis in itself, given how radically it changes pretty much everything in there. Nevertheless, at least for now - meaning until a final version is achieved and a regrind perhaps makes sense - I'll keep growing the smg_comms tree. I'll link all smg_comms .vpatches and my signatures for them from the Code Shelf page as usual, while linking the current ones here as well:

The next chapter will add to the raw types layer, providing conversions from packets to messages and back. That will bring in parts of EuCrypt of course, namely the Serpent and RSA implementations. For this reason, I aim to do this perhaps in two steps (first Serpent and only then RSA) in order to keep patches as clear and easy to follow as possible. And this is of course a very good opportunity to re-read the EuCrypt implementation, to re-write it on one level or another as I bring it over to SMG_Comms and to see first-hand how it is to use it!


4 Responses to “SMG Comms Chapter 2: Raw Types”

  1. You do not need that pile of C-style casts.

    Ada record actually does The Right Thing when you serialize it, and you can get precise bit-field and endianism control, see sect. 25-1 in Barnes.

    As I understand you need strictly four converters: whole serpent packet to/from octet array, and whole rsa packet to/from octet array.

    There is no good reason to manually serialize the subfields in the data structures, as it seems you were planning to do; it will not make for a more maintainable or clearer proggy.

  2. Diana Coman says:

    That is a very good point. My trouble with serializing the data structures (defined as records) directly is the fact that quite a few of them are effectively parametrized (with the parameter a field itself) - i.e. how many and of what depends on one or several of the fields. Basically the only "sure" thing is that with all the padding it ends up of serpent/rsa length but it's unclear upfront how much padding and what all the fields are. For this reason I left in here at this stage those casts - if I manage to *not* need them later on, I'll certainly take them out. Perhaps I'm just not good enough with Ada yet to see clearly the way to do this without needing any manual casts.

    "As I understand you need strictly four converters: whole serpent packet to/from octet array, and whole rsa packet to/from octet array." - yes, this is certain.

  3. Diana Coman says:

    Re converters, to clarify: four converters indeed, from serpent packet to/from serpent message and from rsa packet to/from rsa message. Note that BOTH packets and messages are effectively octet arrays; there is no higher level concern at this stage (neither needed nor desirable really: this layer doesn't care what those octets mean, not here).

    The packet to message converter is effectively a decrypt, while the message to packet is an encrypt, that's all.

  4. [...] brings together RSA and OAEP as well as the raw representation as octets of all the things involved: true random padding, RSA key components, plain messages, encrypted [...]
