~ This is part of the EuCrypt series. Start with Introducing EuCrypt. ~
EuCrypt will use Keccak for all its hashing and RSA padding needs, as per TMSR-RSA specification. In this age of ever-mutating labels on top of labels though, I have to clearly state that EuCrypt will actually use Keccak itself and not SHA-3 or whatever other thing the original Keccak currently morphed into. More specifically and true to its name, EuCrypt's Keccak is a direct implementation of The Keccak Reference, Version 3.01. This means of course that I have to do the implementation from scratch since history is apparently re-written on the webs rather than preserved: current keccak website morphed at some point2 to the "new keccak" aka SHA-3, simply wiping the previous code from sight. Why would one assume their own past, why would one even need or want or try to actually follow the evolution of anything at all, right? There is apparently only "now" on the wide webs outside TMSR and everything is just re-written and replaced with no traceability to speak of.3
On the bright side, this Keccak implementation does not need to rely on the rather unreliable MPI or any other existing parts for that matter. Moreover, it is meant to work perfectly fine as part of EuCrypt itself but also as a standalone component. Consequently, I'm in the happy position of being able to gladly discard C/C++ as programming language and proceed in a much saner and altogether more pleasant to use language: Ada. The discussion of this choice and of Ada itself is outside the scope of this series but the interested reader can find quite a lot on this topic in the logs. This choice of a new programming language comes with its own challenge of course, as Ada is quite new to me, but this is the sort of challenge that you should actively search for and run after rather than the sort to run away from and turn down when it appears. So I'll write it in Ada - nothing better than a pointed need to actually learn something anyway.
Still on this pleasantly bright side, the task of implementing Keccak in Ada is relatively straightforward to start with, mainly due to the clear description of the Keccak permutations in the reference paper. It also helped to be able to play around a bit in the beginning with a previous attempt at Keccak in Ada, by Peter Lambert. Although I discovered a few problems with that initial attempt, it was nevertheless quite useful as a stepping stone to better understand what Keccak is about and what sort of troubles one might have when implementing it in Ada. Once that initial needed exploration was done I proceeded however to implement Keccak separately, from scratch and this new version of mine is the one I will focus on here. In other words, if there are any errors in this code they are entirely mine.
According to the reference paper, Keccak "is a family of sponge functions that use as a building block a permutation from a set of 7 permutations"4. If that doesn't clarify the issue, it's mainly because hash functions are essentially voodoo or in other words, *what* they really do is not all that clearly (as in mathematically) proven anywhere. That aside, *how* this unknown actual effect is achieved is quite clearly defined. In a nutshell, the working is this: a stream of input bits (the original text) are loaded into a 3D structure (essentially a cuboid made of bits, think of a larger, non-standard Rubik's cube where instead of colours you have bits in each cell and the range of permitted movements is not limited by physical considerations); this Keccak cuboid (called "state") is then scrambled by means of 5 transformations applied in a pre-defined order; the same sequence of 5 transformations is then repeated several times with different constants affecting each repetition (each full set of transformations is called a round); the resulting scrambled bits can be extracted then back into a stream that would represent the hash of the original text. Note that the "7 permutations" mentioned in the definition are not the 5 transformations of the state. Instead, the idea is that Keccak itself, as a whole, acts as a permutation over a number of bits b, where b can take 7 distinct values (hence the 7 permutations). Essentially there is only one mechanism but it can work on 7 different sizes of a bitstream.
The first step (the part that this chapter covers) is to implement therefore the Keccak permutation, meaning in more detail precisely this "state" structure and the 5 transformations that can work with it. I'll start by defining the needed knobs, constants and types, all of them part of the SMG_Keccak package in a new file eucrypt/smg_keccak/smg_keccak.ads:
-- S.MG implementation of Keccak-f permutations -- (Based on The Keccak Reference, Version 3.0, January 14, 2011, by -- Guido Bertoni, Joan Daemen, Michael Peeters and Gilles Van Assche) -- S.MG, 2018 package SMG_Keccak is pragma Pure(SMG_Keccak); --stateless, no side effects -> can cache calls --knobs (can change as per keccak design but fixed here for S.MG purposes)-- Keccak_L: constant := 6; --gives keccak z (word) dimension of 2^6=64 and --therefore keccak function 1600 with current --constants (5*5*2^6) --constants: dimensions of keccak state and number of rounds XY_Length: constant := 5; Z_Length: constant := 2**Keccak_L; Width: constant := XY_Length * XY_Length * Z_Length; N_Rounds: constant := 12 + 2*Keccak_L; --types type XYCoord is mod XY_Length; type ZCoord is mod Z_Length; type Round_Index is mod N_Rounds; type ZWord is mod 2**Z_Length; --"lane" in keccak ref type Plane is array(XYCoord) of ZWord; --a "horizontal slice" of keccak state type State is array(XYCoord, XYCoord) of ZWord; --the full keccak state type Round_Constants is array(Round_Index) of ZWord; --magic keccak constants
The "pragma Pure" line at the beginning of the SMG_Keccak package indicates the fact that this implementation of Keccak is made on purpose to be *stateless*. This means that none of the procedures and functions in the package affect any global variables or state, indeed that there are no such global variables or state(s) in the first place. If this strikes you as odd given that Keccak itself has effectively states (through the distinct rounds for instance and further deep down the different transformations that have to be applied in a pre-defined order) note that those are at most *internal* states of Keccak rather than external and there is no reason whatsoever for those "states" to be visible from outside or indeed to be actually stored as such. Each procedure and function in SMG_Keccak really operates on a *given* (as opposed to stored) state (and round constant for the iota function) producing another state, without any need to rely on anything else. In other words, there are no side effects of calling SMG_Keccak functions/procedures. As I am indeed very happy with *not* having any side effects if I can help it at all, that pragma stays exactly where it is.
As you can notice above further in the code after the pragma, there is a single knob for the user to play with, namely Keccak_L or length. The value of this knob however is used to effectively choose one of the "7 permutations" meaning in practice to calculate the number of Keccak rounds (i.e. how many times the full set of 5 transformations are applied), the Z dimension of the Keccak cuboid (Z_Length) and consequently the total width (i.e. how many bits can fit at any one given time). By adjusting this knob, the user can obtain a wider or narrower Keccak cuboid, trading to some extent width for speed (since there are fewer bits and also fewer rounds for a smaller width). However, the other 2 dimensions (X and Y, named for convenience) are fixed at 5, as per Keccak reference documentation. Similarly, the number of rounds takes the length knob into account but it is nevertheless at least 13, as an absolute minimum.
The types defined in the code above take advantage of one of Ada's very useful approaches: each type really spans the set of values that are valid for the intended use and nothing else. For instance, XY_Coord is defined as a modular type, based on XY_Length. This means that valid values of XY_Coord type are only 0 to XY_Coord -1 and moreover, any calculations with XY_Coord type will be considered modulo XY_Length. No need for further code to check on this, no headaches having to check again and again explicitly at all times5 that only this subset of values are valid X, Y coordinates: it's enough to define the type properly here and then simply use it throughout as intended!
Similarly to XY_Coord, there is Z_Coord as modular type with only difference that this is modulo Z_Length, since the Z dimension is not fixed and potentially different from X/Y dimensions. Using those, the Keccak cuboid is defined as "State": a matrix of ZWords, where each ZWord is of length 2^Z_Length (i.e. contains Z_Length bits). The additional type Plane represents a horizontal "slice" of the cuboid and is defined for convenience since it comes in very handy for some of the permutations later on. Note that the reference documentation defines vertical slices as well but I did not find (at least not yet) any actual need for them, so I did not include them as separate types.
The next part of the same file contains the definition of the internal constants and methods of Keccak:
private -- these are internals of the keccak implementation, not meant to be directly -- accessed/used --Keccak magic numbers RC : constant Round_Constants := ( 16#0000_0000_0000_0001#, 16#0000_0000_0000_8082#, 16#8000_0000_0000_808A#, 16#8000_0000_8000_8000#, 16#0000_0000_0000_808B#, 16#0000_0000_8000_0001#, 16#8000_0000_8000_8081#, 16#8000_0000_0000_8009#, 16#0000_0000_0000_008A#, 16#0000_0000_0000_0088#, 16#0000_0000_8000_8009#, 16#0000_0000_8000_000A#, 16#0000_0000_8000_808B#, 16#8000_0000_0000_008B#, 16#8000_0000_0000_8089#, 16#8000_0000_0000_8003#, 16#8000_0000_0000_8002#, 16#8000_0000_0000_0080#, 16#0000_0000_0000_800A#, 16#8000_0000_8000_000A#, 16#8000_0000_8000_8081#, 16#8000_0000_0000_8080#, 16#0000_0000_8000_0001#, 16#8000_0000_8000_8008# ); --gnat-specific methods to have bit-ops for modular types function Rotate_Left( Value : ZWord; Amount : Natural) return ZWord; pragma Import(Intrinsic, Rotate_Left); function Shift_Right( Value : ZWord; Amount : Natural) return ZWord; pragma Import(Intrinsic, Shift_Right); --Keccak permutations function Theta ( Input : in State) return State; function Rho ( Input : in State) return State; function Pi ( Input : in State) return State; function Chi ( Input : in State) return State; function Iota ( Round_Const : in ZWord; Input : in State) return State; --Keccak full function with block width currently 1600 (Width constant above) --this simply applies *all* keccak permutations in the correct order and using -- the keccak magic numbers (round constants) as per keccak reference function Keccak_Function(Input: in State) return State; end SMG_Keccak;
In the internals (private part) of the SMG_Keccak package, there are first the actual values of the constants that essentially differentiate each round from the others. Those are for all intents and purposes magic numbers, no way around it, so they get called in the code precisely that: Keccak magic numbers. After those, there are two gnat-specific methods imported for bit rotation and bit shifting of modular types. While I still don't like these imports, I don't have a good alternative for now, so there they are. Finally, the actual Keccak transformations follow, each of them taking as input a Keccak State (and in the case of the last transformation, iota, also a round constant) and providing as output another Keccak State. The Keccak_Function that follows applies the 5 transformations (theta, rho, pi, chi, iota) in the correct order and moreover iteratively and with the correct constants as required by the pre-established (for this particular Keccak permutation) number of rounds.
The implementation of all the above Keccak transformations and function can be found in eucrypt/smg_keccak/smg_keccak.adb. The code should be relatively easy to follow as it adheres quite closely to the pseudo-code given in the Keccak reference:
-- S.MG, 2018 package body SMG_Keccak is function Theta(Input : in State) return State is Output : State; C : Plane; W : ZWord; begin for X in XYCoord loop C(X) := Input(X, 0); for Y in 1..XYCoord'Last loop C(X) := C(X) xor Input(X, Y); end loop; end loop; for X in XYCoord loop W := C(X-1) xor Rotate_Left(C(X+1), 1); for Y in XYCoord loop Output(X,Y) := Input(X,Y) xor W; end loop; end loop; return Output; end Theta; function Rho(Input : in State) return State is Output : State; X, Y, Old_Y : XYCoord; begin Output(0,0) := Input(0,0); X := 1; Y := 0; for T in 0..23 loop Output(X, Y) := Rotate_Left(Input(X,Y), ( (T+1)*(T+2)/2) mod Z_Length); Old_Y := Y; Y := 2*X + 3*Y; X := Old_Y; end loop; return Output; end rho; function Pi(Input : in State) return State is Output: State; begin for X in XYCoord loop for Y in XYCoord loop Output(Y, 2*X + 3*Y) := Input(X, Y); end loop; end loop; return Output; end pi; function Chi(Input : in State) return State is Output: State; begin for Y in XYCoord loop for X in XYCoord loop Output(X, Y) := Input(X, Y) xor ( (not Input(X + 1, Y)) and Input(X + 2, Y) ); end loop; end loop; return Output; end chi; function Iota(Round_Const : in ZWord; Input : in State) return State is Output: State; begin Output := Input; Output(0,0) := Input(0,0) xor Round_Const; return Output; end iota; function Keccak_Function(Input: in State) return State is Output: State; begin Output := Input; for I in Round_Index loop Output := Iota(RC(I), Chi(Pi(Rho(Theta(Output))))); end loop; return Output; end Keccak_Function; end SMG_Keccak;
To round this off, all we need is a way to compile everything. If you are using gnatmake, it's quite straightforward as it's only a file for now. However, for the future and in the interest of making choices explicit, I've wrote a .gpr file as well (eucrypt/smg_keccak/smg_keccak.gpr), for use with gprbuild:
-- S.MG, 2018 project SMG_Keccak is for Languages use ("Ada"); for Library_Name use "SMG_Keccak"; for Library_Kind use "static"; for Source_Dirs use ("."); for Object_Dir use "obj"; for Library_Dir use "lib"; end SMG_Keccak;
As usual, an implementation cannot really be published without any tests at all, so there is a tests folder too, containing two text files with one distinct test case each (taken from keccak archives that I managed to find) and the corresponding Ada testing code that reads those text files, runs the Keccak implementation and reports at each step if the expected and actual outputs of each transformation match or not. As usual again, this is of course more code than in the implementation that it tests, apparently I can't escape this. Moreover, a lot of it is the faffing about with parsing the input since the original "format" of the test vectors does not strike me at all as particularly friendly for automated tests. The whole testing code is quite strict on having only one single test case per file as well as some specific markers in the text itself to be able to identify correctly each round and state. Nevertheless, strict and long as it is, you can find it all in eucrypt/smg_keccak/tests/smg_keccak.adb:
with SMG_Keccak; use SMG_Keccak; with Ada.Exceptions; use Ada.Exceptions; with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.Fixed; use Ada.Strings.Fixed; with Interfaces; use Interfaces; procedure SMG_Keccak.Test is --types type Keccak_Perms is (None, Theta, Rho, Pi, Chi, Iota); type Test_Vector is array(Keccak_Perms) of State; type Test_Round is array(Round_Index) of Test_Vector; --helper methods procedure print_state(S: in State; Title: in String) is Hex: array(0..15) of Character := ("0123456789ABCDEF"); Len: constant Natural := Z_Length / 4; HexString: String(1..Len); W: ZWord; begin Put_Line("---------" & Title & "---------"); for Y in XYCoord loop for X in XYCoord loop W := S(X,Y); for Z in 0..Len-1 loop HexString(Natural(Len-Z)) := Hex(Natural(W mod 16)); W := W / 16; end loop; Put(HexString & " "); end loop; Put_Line(""); end loop; end; function read_state(File: in FILE_TYPE; Oct: Positive :=8) return State is S: State; Line1: String := "0000000000000000 " & "0000000000000000 " & "0000000000000000 " & "0000000000000000 " & "0000000000000000"; StartPos, EndPos: Positive; Len: Positive := Oct*2; begin for Y in XYCoord loop Line1 := Get_Line(File); StartPos := Line1'First; EndPos := StartPos + Len-1; for X in XYCoord loop S(X,Y) := ZWord'value("16#" & Line1(StartPos..EndPos) & "#"); StartPos := EndPos + 2; --one space to skip EndPos := StartPos + Len - 1; end loop; end loop; return S; end read_state; --reads a full test round from specified file (pre-defined format) function read_from_file (filename : in String; T : out Test_Round) return Boolean is file: FILE_TYPE; InputMarker: String := "lanes as 64-bit words:"; octets: Positive := 8; RoundNo: Round_Index; begin -- try to open the input file begin open(file, In_File, filename); exception when others => Put_Line(Standard_Error, "Can not open the file '" & filename & "'. Does it exist?"); return False; end; -- find & read input state first RoundNo := -1; loop declare Line: String := Get_Line(file); begin --check if this is test data of any known kind if index(Line, InputMarker, 1) > 0 then T(0)(None) := read_state(file, octets); print_state(T(0)(None), "Read Input State"); elsif index(Line, "Round ", 1) > 0 then RoundNo := RoundNo +1; elsif index(Line, "theta", 1) > 0 then T(RoundNo)(Theta) := read_state(file, octets); if (RoundNo > 0) then T(RoundNo)(None) := T(RoundNo-1)(Iota); -- previous state as input end if; elsif index(Line, "rho", 1) > 0 then T(RoundNo)(Rho) := read_state(file, octets); elsif index(Line, "pi", 1) > 0 then T(RoundNo)(Pi) := read_state(file, octets); elsif index(Line, "chi", 1) > 0 then T(RoundNo)(Chi) := read_state(file, octets); elsif index(Line, "iota", 1) > 0 then T(RoundNo)(Iota) := read_state(file, octets); end if; exit when End_Of_File(file); end; end loop; Close(file); return True; end read_from_file; -- performs one single round of Keccak, step by step -- each permutation is tested separately -- test fails with exception raised at first output not matching expected procedure test_one_round(T: Test_Vector; Round: Round_Index) is Input: State; Expected: State; Output: State; Test_One_Round_Fail: Exception; begin Input := T(None); for I in Keccak_Perms range Theta..Iota loop Expected := T(I); case I is when Theta => Output := SMG_Keccak.Theta(Input); when Rho => Output := SMG_Keccak.Rho(Input); when Pi => Output := SMG_Keccak.Pi(Input); when Chi => Output := SMG_Keccak.Chi(Input); when Iota => Output := SMG_Keccak.Iota(RC(Round), Input); when others => null; end case; if (Output /= Expected) then print_state(Output, "----------real output-------"); print_state(Expected, "----------expected output--------"); raise Test_One_Round_Fail; else Put_Line("PASSED: " & Keccak_Perms'Image(I)); end if; -- get ready for next permutation Input := Expected; end loop; end test_one_round; -- end of helper methods --variables T: Test_Round; begin Put_Line("-----Testing with zero state as input------"); if (not read_from_file("testvectorszero.txt", T)) then return; end if; for I in Round_Index loop Put_Line("---round " & Round_Index'Image(I) & "---"); test_one_round(T(I), I); end loop; Put_Line("-----Testing with non-zero state as input------"); if (not read_from_file("testvectorsnonzero.txt", T)) then return; end if; for I in Round_Index loop Put_Line("---round " & Round_Index'Image(I) & "---"); test_one_round(T(I), I); end loop; end SMG_Keccak.Test;
The .gpr file for building the test suite that can then be simply executed:
-- Tests for SMG_Keccak (part of EuCrypt) -- S.MG, 2018 project SMG_Keccak_Test is for Source_Dirs use (".", "../"); for Object_Dir use "obj"; for Exec_Dir use "."; for Main use ("smg_keccak-test.adb"); end SMG_Keccak_Test;
The .vpatch and its signature for this chapter are made quite on purpose to have only the genesis of EuCrypt as ascendant. This reflects the fact that smg_keccak itself does not depend on mpi or smg_rsa. It also allows any users of smg_keccak to potentially take just the smg_keccak tree if that's all they need out of EuCrypt. So the next chapters will build on this one and essentially further develop the smg_keccak branch of the big EuCrypt tree. When the whole smg_keccak is ready, a unifying .vpatch will bring everything back together into a common trunk, as everything gets used together as intended. Until then, here's this first keccak .vpatch and its signature:
In the next chapter I'll further expand this Keccak implementation so stay tuned!
Bertoni, G., Daemen, J., Peeters, M. and Van Assche, G., 2011. The Keccak Reference. Version 3.0 ↩
As the man says: holly shit the original keccak www is gone>. ↩
Why exactly is this so? Try and tell yourself why. From where I am, I can only really see a whole army of underlings (not even employees for they do this dirty work unpaid even, for the saddest part of it) at the Ministry of Truth busily at work, what is there more to say about it. ↩
Bertoni, G., Daemen, J., Peeters, M. and Van Assche, G., 2011. The Keccak Reference, Version 3.0, p. 7 ↩
of course one still keeps that in mind, but it doesn't have to be at the forefront at all times since there's no need to re-implement the check each time coordinates are used. ↩
Comments feed: RSS 2.0
Diana, I'm finding this a delightful reference as I work on my own Keccak implementation. A question: why did you choose to print out the Y coordinate before the X in print_state?
Mainly because I followed quite literally as close as possible the reference description of Keccak and they read line by line rather than column by column. To me the result makes some intuitive sense although the for loops seem "inverted".
I did notice you ended up with some trouble and it might be worth to check you don't invert somewhere lines/cols? While the whole thing is in principle symmetric on X,Y, it can cause trouble if you flip X and Y coords in only one place (and it's quite easy for it to happen at implementation time).
Glad to hear it's readable and useful!
I see! First, the Keccak reference gives the state a left-handed coordinate system, which threw me inside-out. Second, as you said, it assigns the first coordinate to the rows of a slice matrix. Any idea why they would design the system like this?!
Also, my trouble was a few low-pain bugs, and my tests now match yours! (Transposed, that is).
My guess is as good as yours there - no idea what went on in their mind since they didn't share it anywhere I am aware of. The only thing I can say is that the Reference is not exactly a good spec, mainly because of this sort of trouble - I tripped over this too and then repeatedly over the bit order assumptions that are not made explicit anywhere and even a few other bits and pieces that I don't even directly remember without going through it again.
Glad to hear you found and corrected the errors.
[…] effect, working at bit level means also that there really is no need anymore for importing those gnat-specific methods that I never wanted in the first place. As to the “but it’s going to be […]
[…] only Republican-signed Keccak implementation is smg-keccak, a piece of the EuCrypt library by Diana Coman. How could the smg-keccak be incorporated into […]
[…] For comparison, here are the first three rounds of the empty state package with Diana Coman's Keccak: […]
[…] vector of bits. And after fixing two bugs in last week's post, my state transitions match those of s.mg keccak to the […]
[...] introduces Keccak into the refactored esthlos-v. The inner workings of Keccak have already been well explained, including operations at the bit-level and trouble with endianness, so I'm not going to [...]
[...] necessity. Meanwhile, Diana Coman produced and incrementally published a very nice and documented reference implementation in Ada, which was adopted for use in V and soon became [...]