~ This is a work in progress towards an Ada implementation of Eulora's communication protocol. Start with Chapter 1.~
One significant goal of Eulora's communication protocol is to allow the client to request and obtain from the server absolutely *any* file that it might be missing so that there is effectively no need for an "update" of the client anymore. Two of the protocol's current messages target precisely this goal: File Transfer and File Request. The File Transfer message carries chunks of a file while the File Request carries a request from the client for files specified by their names (separated with ';'). And that means of course that one needs to do at least some sort of parsing to extract file names from that list while also handling at any time a set of unknown size containing strings of unknown length. In short: a mess.
My initial stab at the previously-mentioned mess was a rather ugly parametrized record (number of filenames) containing an array of ...parametrized records (each filename having its own length as a parameter). As you might imagine, this is not exactly the sort of thing that "fits in head" - at least not without a bigger hammer. And unsurprisingly, working with such a structure ended up in errors and trouble quite quickly (not to mention it felt about as pleasant as stabbing one's toe at every step). So I threw that first attempt to the bin and decided to store instead the filenames as one single long string (i.e. glued together into one string), alongside an array containing the starting positions of each separate filename in this long string. Basically instead of having separators in the string itself and mixed with the content, there is the content in one place and then neatly to the side the way to access directly any desired filename from the set. This keeps the protocol formatting (separators and the like) here in smg.comms where it belongs rather than pushing it higher up and at the same time it reduces the number of parameters to precisely 2: the number of filenames in there (hence the length of the array of start positions) and the total length of the resulting string (i.e. all filenames lengths added up together). The result is certainly an improvement over the first attempt but I can't say I am terribly fond of it as it is so if you have a better solution to this, go ahead and describe it in the comments below, including why and how it really is the better option - I'll gladly read it.
Using this approach of single string + start positions solved the issue of multiple variable lengths for strings. However, the File Request message still has all sorts of potential troublesome cases, including the case when the given set of filenames ends up longer than one message can carry (at write time) or there are multiple consecutive separators (at read time). My overall approach for such troubles with messages is to check whenever possible and effectively reject a message as invalid if a check fails. Anyway, working with an unknown bunch of unknown strings still remains at all times a sort of ugly spot, no matter what, so those read/write methods still look to me hairier than others so far and there isn't at the moment something specific that I can see to drastically improve them (short of fixing the size of filenames perhaps but I think that's more likely to just push the issue somewhere else as filenames will get padded/trimmed to size). Anyway, the new data structures for this chapter are defined in data_structs.ads:
-- length of a text field (i.e. 16 bits, strictly > 0) subtype Text_Len is Positive range 1..2**16-1; -- A set of file names glued into a single string -- NB: there IS at least ONE filename and one character -- Upper limit for Text_Len is due to protocol's spec of text basic type type U16_Array is array (Text_Len range <> ) of Interfaces.Unsigned_16; type Filenames( F_No: Text_Len := 1; Sz: Text_Len := 1 ) is record -- filenames glued together into 1 single string S : String( 1 .. Sz ) := (others => '0'); -- indices in S, at which each filename starts Starts: U16_Array( 1 .. F_No ) := (others => 1); end record; -- A chunk of a file (for file transfer) type File_Chunk( Len : Text_Len := 1; Count : Interfaces.Unsigned_16 := 1; Name_Len: Text_len := 1) is record Filename: String(1..Name_Len); Content : Raw_Types.Octets(1..Len); end record;
In addition to the data structures above, this chapter adds the following:
- Read/Write from/to Serpent Message for the File_Chunk structure (i.e. File Transfer message).
- Read/Write from/to Serpent Message for the Filenames structure (i.e. File Request message).
- Conversion methods from String to Octets and back. Those are private methods in the Messages packages and are meant for internal use only since messages use raw octets, while filenames and the like are meant as text/strings.
- Read/Write from/to Octets for a 16 bits unsigned value. Similar to the conversion methods above, these are private methods in the Messages package and meant for internal use only. The reason for them to exist in the first place is that 16 bits values are relatively frequent in the protocol (counters and sizes) and their read/write requires an additional step to address potential endianness issues. Since this step is always the same and otherwise easy to forget + hard to debug if/when forgotten, it makes much more sense to have it packed together in one single procedure that can be called wherever needed.
- Refactoring to replace all read/writes of 16-bit values by the new Read_U16/Write_U16 methods.
- Tests for read/write of File Request and File Transfer messages.
- Tests for converters from/to string/octets.
- Small change to the test of pack/unpack RSA messages to ensure that "mangled" message is indeed always different from a valid package1.
The new read/write methods for File Transfer and File Request, described in messages.ads:
----------------- File Transfer ---------------------- -- Writes the given File Chunk to a File Transfer type of message -- Chunk - chunk of file to write; contains counter, filename, content procedure Write_File_Transfer( Chunk : in File_Chunk; Msg : out Raw_Types.Serpent_Msg); -- The opposite of Write_File_Transfer method above. -- Chunk will contain the counter, filename and content procedure Read_File_Transfer( Msg : in Raw_Types.Serpent_Msg; Chunk : out File_Chunk); ----------------- File Request ---------------------- -- Writes a message to request the files specified through their names -- Written parameter will hold the number of filenames actually written -- NB: this can be less than the number of filenames provided! -- When Written < FR.F_No, the FIRST Written filenames were written; the rest -- did not fit into the message and it's up to caller to decide what to do. procedure Write_File_Request( FR : in Filenames; Counter : in Interfaces.Unsigned_16; Msg : out Raw_Types.Serpent_Msg; Written : out Natural); -- Reads a request for files; the opposite of Write_File_Request above -- Raises Invalid_Msg exception if the provided message fails checks. procedure Read_File_Request( Msg : in Raw_Types.Serpent_Msg; Counter : out Interfaces.Unsigned_16; FR : out Filenames);
The implementation of those methods in messages.adb:
------ File Transfer ------ procedure Write_File_Transfer( Chunk : in File_Chunk; Msg : out Raw_Types.Serpent_Msg) is Pos: Integer := Msg'First; U16: Interfaces.Unsigned_16; begin -- write type ID Msg(Pos) := File_Transfer_S_Type; Pos := Pos + 1; -- write filename as text field (size+2, text) -- check against overflows if Chunk.Name_Len > Text_Len'Last - 2 or Pos + Integer(Chunk.Name_Len) + 2 > Msg'Last then raise Invalid_Msg; end if; -- write total size: filename size + 2 U16 := Interfaces.Unsigned_16( Chunk.Name_Len + 2 ); Write_U16( Msg, Pos, U16 ); -- write filename String_To_Octets( Chunk.Filename, Msg(Pos..Pos+Integer(Chunk.Name_Len)-1) ); Pos := Pos + Integer(Chunk.Name_Len); --write content -- check against overflow, including the 2 octets for counter at the end if Chunk.Len > Text_Len'Last - 2 or Pos + Integer(Chunk.Len) + 4 > Msg'Last then raise Invalid_Msg; end if; -- write total size for this text field U16 := Interfaces.Unsigned_16( Chunk.Len + 2 ); Write_U16( Msg, Pos, U16 ); -- write actual content Msg(Pos..Pos+Chunk.Content'Length-1) := Chunk.Content; Pos := Pos + Chunk.Content'Length; -- write counter Write_U16( Msg, Pos, Chunk.Count ); -- write padding if needed if Pos <= Msg'Last then RNG.Get_Octets( Msg(Pos..Msg'Last) ); end if; end Write_File_Transfer; -- The opposite of Write_File_Transfer method above. -- Counter will contain the message counter -- Chunk will contain the chunk counter, filename and content procedure Read_File_Transfer( Msg : in Raw_Types.Serpent_Msg; Chunk : out File_Chunk) is Pos: Integer := Msg'First; U16: Interfaces.Unsigned_16; S_Name, E_Name: Integer; --start/end for filename in Msg S_Len: Text_Len; -- length of filename (needed as Text_Len anyway) S_Content, E_Content: Integer; --start/end for content in Msg Content_Len: text_Len; -- length of content (needed as Text_Len anyway) begin -- read and check type ID if Msg(Pos) /= File_Transfer_S_Type then raise Invalid_Msg; end if; Pos := Pos + 1; -- read filename size Read_U16( Msg, Pos, U16 ); -- check for overflow and underflow; filename size >= 1 if Pos + Integer(U16) - 2 > Msg'Last or U16 < 3 then raise Invalid_Msg; end if; U16 := U16 - 2; S_Len := Text_Len(U16); -- set start + end for reading filename later, when ready S_Name := Pos; E_Name := Pos + Integer(U16)-1; Pos := Pos + S_Len; -- read size of content Read_U16( Msg, Pos, U16 ); -- check for overflow and underflow; content >=1; counter =2 octets if Pos + Integer(U16) - 1 > Msg'Last or U16 < 3 then raise Invalid_msg; end if; U16 := U16 - 2; Content_Len := Text_Len(U16); -- set start and end for reading content later, when ready S_Content := Pos; E_Content := Pos + Integer(U16) - 1; Pos := Pos + Content_Len; -- read counter Read_U16( Msg, Pos, U16 ); -- check chunking validity i.e. if counter>0 then no padding if U16 /= 0 and Pos /= Msg'Last then raise Invalid_Msg; end if; -- create File_Chunk structure and fill it with data from Msg declare FC : File_Chunk( Len => Content_Len, Count => U16, Name_Len => S_Len); begin -- read from Msg FC.Content := Msg( S_Content..E_Content ); Octets_To_String( Msg( S_Name..E_Name ), FC.Filename); -- copy to output var Chunk := FC; end; end Read_File_Transfer; ---- File Requests ---- procedure Write_File_Request( FR : in Filenames; Counter : in Interfaces.Unsigned_16; Msg : out Raw_Types.Serpent_Msg; Written : out Natural) is Pos : Integer := Msg'First; Max_Pos: Integer := Msg'Last - 2; -- 2 octets at end for counter Text_Sz: Integer; Max_Sz : Integer; begin -- write ID for File Request type Msg( Pos ) := File_Req_S_Type; Pos := Pos + 1; -- write Text size: filenames + separators -- consider fewer filenames if they don't ALL fit -- 2 octets are taken by size itself Max_Sz := Max_Pos - Pos - 1; Text_Sz := FR.Sz + FR.F_No - 1; if Text_Sz > Max_Sz then -- walk the array of filenames backwards and stop when they fit Written := FR.F_No - 1; -- calculate actual size written based on start of first discarded -- filename and (Written -1) octets for needed separators Text_Sz := Integer(FR.Starts(Written+1)) - FR.Starts'First + (Written - 1); -- loop until either fits or nothing left while Written > 0 and Text_Sz > Max_Sz loop Written := Written - 1; Text_Sz := Integer(FR.Starts(Written+1))- FR.Starts'First + (Written - 1); end loop; -- check that there is what to write, since nothing -> invalid message if Written = 0 then raise Invalid_Msg; end if; else --from if Text_Sz > Max_Sz -- ALL are written Written := FR.F_No; end if; -- write Text_Sz + 2 (i.e. TOTAL size) if Text_Sz + 2 > Integer(Interfaces.Unsigned_16'Last) then raise Invalid_Msg; end if; Write_U16( Msg, Pos, Interfaces.Unsigned_16(Text_Sz+2) ); -- write filenames separated by Sep for I in 1..Written loop declare Start_Pos : Positive; End_Pos : Positive; Len : Positive; begin -- current start pos in FR.S Start_Pos := Positive( FR.Starts( FR.Starts'First + I - 1)); -- calculate end based on start of next name or last if I < FR.F_No then End_Pos := Positive( FR.Starts( FR.Starts'First + I)) - 1; else End_Pos := FR.S'Last; end if; -- NB: this WILL fail if starting positions are not in order! Len := End_Pos - Start_Pos + 1; if Len <= 0 then raise Invalid_Msg; end if; --write the actual filename String_To_Octets( FR.S( Start_Pos..End_Pos ), Msg(Pos..Pos+Len-1) ); Pos := Pos + Len; --if it's not the last one, write a separator if I < Written then Msg(Pos) := Sep; Pos := Pos + 1; end if; end; end loop; -- write the message counter in little endian at all times Write_U16( Msg, Pos, Counter ); -- write padding if needed if Pos <= Msg'Last then Rng.Get_Octets( Msg(Pos..Msg'Last) ); end if; end Write_File_Request; -- Reads a request for files; the opposite of Write_File_Request above procedure Read_File_Request( Msg : in Raw_Types.Serpent_Msg; Counter : out Interfaces.Unsigned_16; FR : out Filenames) is Pos : Integer := Msg'First; Max_Pos : Integer := Msg'Last - 2; --at least 2 reserved for counter Text_Sz : Integer; Max_Sz : Integer := Max_Pos - Pos - 1; --text only i.e. w.o. size itself F_No : Integer; U16 : Interfaces.Unsigned_16; begin -- read type ID and check if Msg(Pos) /= File_Req_S_Type then raise Invalid_Msg; end if; Pos := Pos + 1; -- read total size of filenames+separators Read_U16( Msg, Pos, U16 ); Text_Sz := Integer(U16); -- take away the 2 octets for size itself Text_Sz := Text_Sz - 2; -- check that Text_Sz is not overflowing/underflowing if Text_Sz < 1 or Text_Sz > Max_Sz then raise Invalid_Msg; end if; -- count first the separators to know how many filenames -- NB: there is always at least 1 filename as Text_Sz > 0 F_No := 1; for I in Pos .. Pos + Text_Sz - 1 loop if Msg(I) = Sep then F_No := F_No + 1; end if; end loop; -- create the output structure and discard separators -- text without separators should be Text_Sz - F_No + 1 -- (because ONLY one separator between 2 filenames allowed) -- if it's not that => Invalid_Msg -- F_No and Text_Sz are not overflow (earlier check + calc) declare F : Filenames(Text_Len(F_No), Text_Len(Text_Sz-F_No+1)); S_Pos : Positive; Index : Positive; begin S_Pos := F.S'First; Index := F.Starts'First; F.Starts(Index) := Interfaces.Unsigned_16(S_Pos); for I in Pos .. Pos + Text_Sz - 1 loop -- copy over to F.S anything that is not separator if Msg(I) /= Sep then F.S( S_Pos ) := Character'Val(Msg(I)); S_Pos := S_Pos + 1; else -- if it's separator, check and if ok, add next as start if I = Pos + Text_Sz or -- separator as last character is error Msg(I+1) = Sep or -- 2 consecutive separators is error Index >= F.Starts'Last then -- too many separators is error raise Invalid_Msg; else Index := Index + 1; F.Starts( Index ) := Interfaces.Unsigned_16(S_Pos); end if; end if; end loop; -- copy the whole structure to output variable FR := F; end; -- read message counter now Pos := Pos + Text_Sz; Read_U16( Msg, Pos, Counter ); end Read_File_Request;
The converters between String and Octets (messages.adb):
-- String to Octets conversion procedure String_To_Octets(Str: in String; O: out Raw_Types.Octets) is begin Assert( Str'Length = O'Length ); for I in 1..Str'Length loop O( O'First+I-1 ) := Character'Pos(Str(Str'First + I - 1 )); end loop; end String_To_Octets; -- Octets to string conversion -- NB: Str'Length has to be EQUAL to Octets'Length! procedure Octets_To_String(O: in Raw_Types.Octets; Str: out String) is begin Assert( O'Length = Str'Length ); for I in 1..O'Length loop Str( Str'First+I-1 ) := Character'Val(O(O'First + I - 1 )); end loop; end Octets_To_String;
The read/write utilities for values on 16 bits (messages.adb):
-- Write a 16 bits value to Octets at Pos; Pos increases by 2. procedure Write_U16( Msg: in out Raw_Types.Octets; Pos: in out Natural; U16: in Interfaces.Unsigned_16) is begin Msg(Pos..Pos+1) := Raw_Types.Cast(U16); Cast_LE(Msg(Pos..Pos+1)); Pos := Pos + 2; end Write_U16; -- Read a 16-bits values from Octets from Pos; Pos increases by 2. procedure Read_U16( Msg: in Raw_Types.Octets; Pos: in out Natural; U16: out Interfaces.Unsigned_16) is O2 : Raw_Types.Octets_2; begin O2 := Msg(Pos..Pos+1); Cast_LE(O2); U16 := Raw_Types.Cast(O2); Pos := Pos + 2; end Read_U16;
The .vpatch and my signature for it can be found on my Reference Code Shelf as usual or through those links:
Yes, I managed to fall during one of various testing sessions upon the one case where it was not, such are some of my talents, what can I tell you. ↩
Comments feed: RSS 2.0
';' is a valid character in Linux filenames, along with all other possible octets with the exception of 0x00 and 0x2F, or '/'. If you're going to use a separator character, seems you might as well use null, being that it's already reserved. Otherwise, you're imposing an implicit constraint above and beyond those of the OS that you'll need to keep track of and check for. Maybe you plan to just never use ';' in a filename ever, but why even worry about it, when there's already a value that simply cannot appear in any valid filename?
However, you say "protocol's current messages", so is this a live protocol that needs to be supported going forward? If so, then one does what one must, but it's an exception to document.
The filenames will be what the server decides, not at all what linux considers "valid" or not. Not to mention that unix filenames are already a mess anyway (see the logs for full discussion on this) so not exactly something to look up to.
The above being said, I am rather for no separators at all + fixed size => loads of problems simply don't exist anymore. However, it's not yet fully clear if this will fit the game's needs and it's not yet burning, so it's still being considered essentially.