Bringing Code Text to the Blog, Courtesy of VaMP



May 12th, 2022 by Diana Coman

Killing two birds with one stone here, I finally found a way to painlessly and easily produce blog articles out of code-writing, too. Writing code is still writing after all, and I really don't see any reason why those writing today's literature - also known as "coding" - should be content with being read merely by machines, without publishing it on blogs or seeking a human readership that might be - that is indeed *hoped to be* - more discerning and less easily satisfied, perhaps, but all the more rewarding to interact with! Getting to the point though, I have updated VaMP yet again, to make publishing of code for reference and discussion a natural and integral part of coding instead of some afterthought at best or horrible chore and hated imposition at worst. It took a couple of days and another dive into html and css but the effort has been worth it, seeing how the result is that the machine will do it all for me from now on, quite reliably and without any further probing, questioning or any trouble at all.

Building up concretely that desired support for collaboration between code writers, readers and publishers, VaMP from now on produces automatically, as an integral part of its use, the html file containing the full description of the codebase transformation that each applied patch stands for. This html description includes a clear marking of any created, deleted, moved or unchanged files as well as the full content of every changed file, with a link provided for each line of code and html class attributes that enable full control of the exact presentation form via a css file or similar.
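
As a rough illustration of what this means for a single line of code, here is a minimal standalone sketch in Ada. It is not VaMP's own code: the procedure Row_Sketch and its helper Row are made up for this example, and only the markup they emit is modelled on the Write_Class and Html_Anchor routines in the patch shown further down. The content passed in is assumed to be html-safe already.

```ada
with Ada.Text_IO;

procedure Row_Sketch is
   --  Hypothetical helper, not part of VaMP: builds one table row out of
   --  a line label, a css class name and content that is already html-safe.
   function Row (Label, Class, Content : String) return String is
      --  The label is both the anchor's name and its link target,
      --  so that every line of code can be referenced directly.
      Anchor : constant String :=
        "<a name='L" & Label & "' href='#L" & Label &
        "' class='line_no'>" & Label & "</a>";
   begin
      --  First cell holds the line label, second cell holds the code,
      --  with the class attribute left for the css to style.
      return "<tr><td><pre>" & Anchor & " </pre></td>" &
             "<td><pre class=""" & Class & """>" & Content &
             "</pre></td></tr>";
   end Row;
begin
   Ada.Text_IO.Put_Line (Row ("2_23", "line_keep", "begin"));
end Row_Sketch;
```

The printed row carries the anchor L2_23 for linking and the class line_keep for styling, so a reference such as #L2_23 appended to the page's address points at that exact line.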

The part of my css file added to handle the presentation of VaMP patches currently uses alternating light yellow and beige backgrounds to visually separate consecutive lines from one another and color-codes text with gray for unchanged, blue for added, and red with a line striking through the text for deleted. While there is scope, quite on purpose, for more fine-grained tuning of the presentation, I'd rather add any more detailed bits and parts only when and if needed, so for the time being I'm quite fine with keeping it as short and straightforward as possible[1]. Worth noting that the whole view even wraps any longer lines correctly to the screen size, as any text viewer should do (and *not* requiring that the writer breaks the text at arbitrary places to force on anyone whatever screen width the writer enjoyed best). All of it is achieved simply with html and css and without any additional plugins or whatever else, since there really is no need for and no gain from any such added complexity.

Since a first attempt to publish the html of a patch revealed that my footnotes plugin is so eager to help that it will interpret as a footnote any repeated round brackets even in text enclosed within <pre> tags, I get to illustrate the shiny new html view of patches, most appropriately, through the latest and possibly smallest patch on VaMP itself - the addition of round brackets to the list of html-escaped characters in VaMP's shiny new html_io package:

vamp_html_brackets.patch

src/html_io.adb

2_1 
	-- DC, 2022
2_2 

2_3 
with Ada.Strings.Fixed; use Ada.Strings.Fixed;
2_4 
with Ada.Strings.Maps; use Ada.Strings.Maps; -- for the html chars
2_5 
with Ada.Strings; use Ada.Strings; -- for Trim's Both
2_6 
with Ada.Characters.Handling; use Ada.Characters.Handling; -- for To_Lower
2_7 
with Ada.Directories; use Ada.Directories;
2_8 

2_9 
with Raw_Types; use Raw_Types; -- String_To_Raw
2_10 
with Raw_IO; use Raw_IO; -- Write_Octets
2_11 

2_12 
package body HTML_IO is
2_13 
	-- packages from Raw_Types;
2_14 
	use Octets_Buffer_Pkg;
2_15 

2_16 
	procedure Write_Class( class  : in html_class;
2_17 
	                       content: in String;
2_18 
	                       fname  : in String;
2_19 
	                       append : in Boolean := True) is
2_20 
		open_str: String := Html_Open(class);
2_21 
		close_str: String := Html_Close(class);
2_22 
		cls: String := """" & Trim(To_Lower(class'Image), Both) & """";
2_23 
	begin
2_24 
		case class is
2_25 
			when patch =>
2_26 
				-- content = filename of patch, assumed html-safe
2_27 
				declare
2_28 
					cstr: String := Html_Anchor(href_str => content,
2_29 
					                            text_str => Simple_Name(content));
2_30 
					s: String := open_str & cstr & close_str & lf_str;
2_31 
					o: Elem_Array := String_To_Raw(s);
2_32 
				begin
2_33 
					Write_Octets(fname => fname, data => o, append => append);
2_34 
				end; -- patch
2_35 
			when line_no =>
2_36 
				-- content = number/code, as desired, assumed html-safe
2_37 
				-- this opens a <tr> that is closed by the next line_*
2_38 
				declare
2_39 
					nstr: String := "L" & content;
2_40 
					hstr: String := "#" & nstr;
2_41 
					cstr: String := "<tr><td><pre>" &
2_42 
					                Html_Anchor(name_str => nstr, href_str => hstr,
2_43 
					                            class_str => "line_no",
2_44 
					                            text_str => content) & " </pre></td>";
2_45 
					o: Elem_Array := String_To_Raw(cstr);
2_46 
				begin
2_47 
					Write_Octets(fname => fname, data => o, append => append);
2_48 
				end; -- line_no
2_49 
			when line_del | line_add | line_keep =>
2_50 
				-- content as relevant, will be html-escaped
2_51 
				declare
2_52 
					cstr: String := escape_html(content);
2_53 
					s : String := "<td><pre class=" & cls & ">" &
2_54 
					              cstr & "</pre></td></tr>";
2_55 
					o : Elem_Array := String_To_Raw(s);
2_56 
				begin
2_57 
					Write_Octets(fname => fname, data => o, append => append);
2_58 
				end; -- line_del|add|keep
2_59 
			when hunk_hdr =>
2_60 
				-- content as relevant, assumed html-safe
2_61 
				declare
2_62 
					s : String := "<tr><td colspan=""2""><pre class=" & cls & ">" &
2_63 
					              content & "</pre></td></tr>" & lf_str;
2_64 
					o : Elem_Array := String_To_Raw(s);
2_65 
				begin
2_66 
					Write_Octets(fname => fname, data => o, append => append);
2_67 
				end; -- others than patch or line_no
2_68 
			when others =>
2_69 
				-- content as relevant, assumed html-safe
2_70 
				declare
2_71 
					s : String := open_str & content & close_str & "<br>" & lf_str;
2_72 
					o : Elem_Array := String_To_Raw(s);
2_73 
				begin
2_74 
					Write_Octets(fname => fname, data => o, append => append);
2_75 
				end; -- others than patch or line_no
2_76 
		end case;
2_77 
	end Write_Class;
2_78 

2_79 
	procedure Write_Tag( tagname: in String;
2_80 
	                     opening: in Boolean;
2_81 
	                     fname  : in String;
2_82 
	                     append : in Boolean := True) is
2_83 
	begin
2_84 
		if opening then
2_85 
			Write_Octets(fname => fname, append => append,
2_86 
			             data => String_To_Raw("<" & tagname & ">"));
2_87 
		else
2_88 
			Write_Octets(fname => fname, append => append,
2_89 
			             data => String_To_Raw("</" & tagname & ">"));
2_90 
		end if;
2_91 
	end Write_Tag;
2_92 

2_93 
	---------------own utilities and varia
2_94 

@@ -95,7 +95,8 @@
2_95 
	function escape_html(src: in String)
2_96 
			return String is
2_97 
		-- html characters to escape and correspondence function
2_98 
		html: constant Character_Set := To_Set("<'>""&");
2_99 
		-- round brackets are escaped to avoid footnote interpretation
2_100 
		html: constant Character_Set := To_Set("<'>""&()");
2_101 
		function To_Html(c: in Character) return String is
2_102 
		begin
2_103 
			if c = '<' then
2_104 
				return "&lt;";
2_105 
			elsif c = '>' then
2_106 
				return "&gt;";
2_107 
			elsif c = ''' then
2_108 
				return "&apos;";
2_109 
			elsif c = '"' then
@@ -108,6 +109,10 @@
2_110 
				return "&quot;";
2_111 
			elsif c = '&' then
2_112 
				return "&amp;";
2_113 
			elsif c = '(' then
2_114 
				return "&lpar;";
2_115 
			elsif c = ')' then
2_116 
				return "&rpar;";
2_117 
			else
2_118 
				return "" & c;
2_119 
			end if;
2_120 
		end To_Html;
2_121 
		-- count for final max length of the string
2_122 
		plus: Natural := Count(Source => src, Set => html);
2_123 
		--lt,gt add only 3, amp 4
2_124 
		maxlen: constant Natural := src'Length + plus*5;
2_125 
		result: String(1..maxlen);
2_126 
		pos: Natural := result'First;
2_127 
		idx, len: Natural;
2_128 
		from: Natural := src'First;
2_129 
	begin
2_130 
		-- loop and copy from src
2_131 
		while from <= src'Last loop
2_132 
			idx := Index(Source => src(from..src'Last), Set => html);
2_133 
			exit when idx = 0; -- nothing found
2_134 
			-- write to res html sequence for src char and advance in src
2_135 
			declare
2_136 
				hs: String := To_Html(src(idx));
2_137 
			begin
2_138 
				-- any characters before the one to escape
2_139 
				len := idx - from;
2_140 
				if len > 0 then
2_141 
					result(pos..pos+len-1) := src(from..from+len-1);
2_142 
					pos := pos + len;
2_143 
				end if;
2_144 
				-- escaped character
2_145 
				len := hs'Length;
2_146 
				result(pos..pos+len-1) := hs;
2_147 
				pos := pos + len;
2_148 
			end;
2_149 
			-- advance in src
2_150 
			from := idx + 1;
2_151 
		end loop;
2_152 
		-- copy any remaining content from src
2_153 
		if from <= src'Last then
2_154 
			len := src'Last - from + 1;
2_155 
			result(pos..pos+len-1) := src(from..from+len-1);
2_156 
			pos := pos + len;
2_157 
		end if;
2_158 
		return result(result'First..pos-1);
2_159 
	end escape_html;
2_160 

2_161 
	function Html_Anchor( href_str: in String;
2_162 
	                      text_str: in String)
2_163 
			return String is
2_164 
	begin
2_165 
		return "<a href='" & href_str & "'>" & text_str & "</a>";
2_166 
	end Html_Anchor;
2_167 

2_168 
	function Html_Anchor( name_str: in String;
2_169 
	                      href_str: in String;
2_170 
	                      text_str: in String)
2_171 
			return String is
2_172 
	begin
2_173 
		return "<a name='" & name_str & "' href='" & href_str &
2_174 
			"'>" & text_str & "</a>";
2_175 
	end Html_Anchor;
2_176 

2_177 
	function Html_Anchor( name_str : in String;
2_178 
	                      href_str : in String;
2_179 
	                      class_str: in String;
2_180 
	                      text_str : in String)
2_181 
			return String is
2_182 
	begin
2_183 
		return "<a name='" & name_str & "' href='" & href_str &
2_184 
			"' class='" & class_str & "'>" & text_str & "</a>";
2_185 
	end Html_Anchor;
2_186 

2_187 
	-- returns the html string that opens an html element of given class
2_188 
	function Html_Open( class: in html_class )
2_189 
			return String is
2_190 
	begin
2_191 
		return "<span class='" & To_Lower(html_class'Image(class)) & "'>";
2_192 
	end Html_Open;
2_193 

2_194 
	-- returns the html string that closes an html element of given type
2_195 
	function Html_Close( class: in html_class)
2_196 
			return String is
2_197 
	begin
2_198 
		case class is
2_199 
			when line_no =>
2_200 
				return "";	-- special case as line_no is <a> and closes itself
2_201 
			when others =>
2_202 
				return "</span>";
2_203 
		end case;
2_204 
	end Html_Close;
2_205 

2_206 
end HTML_IO;
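
To see the effect of the two added entities in isolation, here is a minimal standalone sketch in Ada. It is a simplified stand-in rather than the escape_html above: the names Escape_Sketch and Escape are made up for this example, and the result is built by appending to an unbounded string instead of filling a pre-sized buffer, trading speed for brevity.

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Escape_Sketch is
   --  Simplified stand-in for escape_html (an assumption for illustration):
   --  walks the input once and appends either the character itself
   --  or the html entity that replaces it.
   function Escape (Src : String) return String is
      Result : Unbounded_String;
   begin
      for C of Src loop
         case C is
            when '<'    => Append (Result, "&lt;");
            when '>'    => Append (Result, "&gt;");
            when '''    => Append (Result, "&apos;");
            when '"'    => Append (Result, "&quot;");
            when '&'    => Append (Result, "&amp;");
            when '('    => Append (Result, "&lpar;");
            when ')'    => Append (Result, "&rpar;");
            when others => Append (Result, C);
         end case;
      end loop;
      return To_String (Result);
   end Escape;
begin
   --  Repeated round brackets are what the footnotes plugin trips over.
   Ada.Text_IO.Put_Line (Escape ("Count((x))"));
end Escape_Sketch;
```

With the brackets escaped, the text handed to the blog contains no literal round brackets at all, so there is nothing left for a footnote plugin to match, while the browser still displays them as usual.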

If you think perhaps that the above is "too long" given the small actual change, note that the impact of even a small change is not something that the machine can really evaluate in any meaningful way. As a consequence, VaMP simply provides as much context as it can find for a change, and that means the whole affected file. If anything, I count this as a plus, since it even adds some very healthy incentives. First, every change to existing code is quite expensive for readers and users, as it requires re-loading a potentially wider context, and having at the very least the whole file to re-read for any proposed change is a good reminder of this. Second, all the better if this also pushes one to keep code files small and clear enough so that publishing a patch does not result in a huge thing that nobody bothers with. Finally, this is likely to penalize (correctly) huge patches as well as messy codebases, since then any change touches a lot of files and so the resulting html view of the patch balloons to include possibly most of the code, each and every time - hopefully until the codebase is finally structured more sanely.

Obviously, the above incentives are only that - they can't force anyone to change anything and one can always simply avoid the trouble thus uncovered. For instance, one can always choose and publish only excerpts rather than the full thing, since VaMP does not force nor is it intended to force publication of anything - it simply facilitates and provides all the required support to make it very easy indeed for code writers to publish their *writing* as such, and for code readers or publishers to reference and discuss a proposed change. This is exactly as it should be, since on one hand there can be valid situations when partial publishing is the most that can be done without violating other constraints of the wider context and, on the other hand, there is no imposition on the user to publish or not publish anything, only an additional provision and a better alignment of incentives.

In short, this latest improvement of my versioning tool helps with publishing too and precisely in the best possible way: automating the useful but tedious part of the task and thus improving my efficiency significantly. Yet again.


  1. For reference, here's what the relevant part of the css looks like at the moment:

    /* for html view of code patches */
    .patch {background: Beige}
    .code_table {
    	background-color: LightYellow;
    	border: none;
    	font: 1em 'Courier New', Courier, Fixed;
    }
    .code_table td {
    	border: none;
    	padding: 0;
    	margin: 0;
    }
    .code_table tr {
    	border: none;
    	padding: 0;
    	margin: 0;
    }
    .code_table tr:nth-child(even) {background: LightYellow}
    .code_table tr:nth-child(odd) {background: Beige}
    .code_table pre {
    	word-wrap : break-word;	/* IE 5.5+ */
    	white-space: pre-wrap;	/* css 2.1 + */
    	white-space: -moz-pre-wrap; /* mozilla... */
    	white-space: -pre-wrap; /* various Opera versions */
    	white-space: -o-pre-wrap;
    
    	padding: 0;
    	margin: 0;
    
    	tab-size: 2;
    	-moz-tab-size: 2;	/* mozilla is special... */
    	font: 1em 'Courier New', Courier, Fixed;
    }
    
    .code_table a {
    	color: grey; text-decoration: none;
    }
    .dir_del, .file_del, .line_del {
    	color: red; text-decoration: line-through;
    }
    .dir_move, .file_move {
    	color: brown; text-decoration: none;
    }
    .dir_move, .dir_del, .dir_add {
    	font-weight: bold;
    }
    .file_keep, .line_keep {
    	color: grey; text-decoration: none;
    }
    .file_change {
    	color: blue; text-decoration: none;
    }
    .dir_add, .file_add, .line_add {
    	color: blue; text-decoration: none;
    }
    

     


11 Responses to “Bringing Code Text to the Blog, Courtesy of VaMP”

  1. Jacob Welsh says:

    Nice to see that "pre-wrap" trick got in. Any particular reason for forcing the font to Courier? (Since you're using "pre"s it should otherwise follow the user's preferred monospace font, as indeed the snippet in footnote 1 does.)

    What does the "2_" prefix in the line numbering mean - second file of a set?

    It's definitely nice to have the full-file-context change markup available at least in some view. I suppose using it as the default for publication finally provides some meaning to the file boundary and a reason for slicing the code-scroll into smaller files beyond merely "my editor/compiler/whatever tool sucks". At worst it's no more arbitrary than the magic three lines of context.

    I don't see that the "@@ -95,7 +95,8 @@" headers are doing anything useful in this view though, as it's erased the chunk boundaries they once demarcated.

  2. Diana Coman says:

    Any particular reason for forcing the font to Courier?

    I think it came from avoiding difference in size and look of the font between labels (line numbers links) and the code-content. Other than this, I don't particularly crave monospace for code and so I didn't bother to flip it the other way around but it's certainly possible. The css really is something to tweak as practice informs is most useful, neither anything to do with VaMP nor something yet crystallised as such. Do you find that Courier annoying in the context or for code?

    What does the "2_" prefix in the line numbering mean - second file of a set?

    Yes, since the html is really the view of a whole patch, line numbers have to come with a file number too. In this case the first file was the manifest and I didn't publish that because there wasn't much point to it here but well spotted!

    I don't see that the "@@ -95,7 +95,8 @@" headers are doing anything useful in this view though, as it's erased the chunk boundaries they once demarcated.

    I was/am in two minds about whether to leave that in or not, indeed. My initial view on this was that one might want to further make those into links to the exact corresponding hunk as given in the patch file if that is published too but I can't say that I have a strong case for either "they must be left in there" or "they should NOT appear in there" so kind of left it for now to see how it turns out more useful in practice. Would you really rather not have them in there?

  3. Jacob Welsh says:

    Do you find that Courier annoying in the context or for code?

    It was more just wondering why specify it at all or why specifically for code tables; but since you ask, I don't especially mind Courier but it's not my favorite; in particular it suffers from 1/l and O/0 ambiguity and has rather wide glyphs and light stroke weight compared to other screen fonts (perhaps that's just some artifact of my present setup but I think I've noticed it more generally).

    My initial view on this was that one might want to further make those into links to the exact corresponding hunk as given in the patch file if that is published too

    Ah, I could see such links perhaps being helpful to have, whether from the hunk headers or something else, since the patch, as the complete thing being signed, is the more primary source. Even without explicit links, the numbers from the header could serve as search term. At the same time, they still look kind of odd to me, sitting three lines above the actual change while the context extends past those three. So, perhaps to be determined in time and in any case a minor point.

  4. Diana Coman says:

    Thanks and good point re 1/l ambiguity so it's changed to Monospace. Updated the css for .code_table pre and added the font-family to use for td pre a, as well:

    td pre a {
    font-family: Monospace, Sans-serif;
    }

    I can see the oddity of the hunk headers, more likely use will inform as to what is most useful there and it will get changed then quite easily.

  5. Jacob Welsh says:

    I was taking a fresh look at this as an option for patch blogging; what do you think of using the "semantic" ins/del tags instead of (or in addition to) the custom element classes? (Not sure if there's any that'd fit the "changed file".) Seems to me it'd be nice to have some minimally legible output even before adding external CSS, where currently there's no way to distinguish added, deleted and unchanged lines and files. (My browser by default renders del with a strikethrough and ins with an underline.)

  6. Diana Coman says:

    I would need to check that it doesn't interfere with the element classes when they are both present.

    As you notice, there isn't a full set of such tags that one could use for the task and this is exactly why I didn't go with ins/del in the first place - because they don't solve the problem and then I'm still stuck adding more on top with the potential for the two to interfere and make an even bigger mess overall. I agree that it would be nice to have at least *some* legible output even before css but only if there is a way to make it without breaking when the css is present as well. I'll admit that html and especially the whole "semantic web" and presentation part is one of those things that I never gladly dig into.

    If you want these two tags in and you already looked at how they interact with the css when both are present, either just add them and send over the diff or let me know and I'll look into it, I don't expect it would take much.

  7. Jacob Welsh says:

    When both are present, for instance

    <pre class="line_del"><del>blabla</del></pre>

    then by browser defaults it shows for me in black with a strikethrough. It can still be styled at either the class or element level, as for instance

    .line_del { color: red; }

    or

    .code_table del { color: red; }

    however if browser defaults specific to the del element are to be changed, such as removing the strikethrough for whatever reason, it would have to use the second form. Or for maximum verbosity it would also work to put the line_del class on the del too.

  8. Diana Coman says:

    Thanks for trying those out for me and coming back with it. Indeed, when adding the ins/del tags, with the current css it's both the browser's defaults and the class-level styling that gets applied but since in this particular case it means simply that the added lines end up in blue AND underlined, I can't say I mind it enough to change the css just for it and I'm fine with supporting the ins/del tags as well since you asked for them, not a problem.

    I've added them to the latest VaMP and will deploy it shortly. To get the latest html for any patch, it's enough to simply apply the patch with the latest vamp.

  9. [...] This time, due to the significance of the work and perhaps further animated by glimpses of a shiny new code-blogging tool and its rationale, I will again present the bulk of the changes directly for fine-grained elucidation. After all, if [...]

  10. [...] enabled the development of very useful tools that pushed quite quickly for further developments and new connections that were simply not even visible before the successful regrind. This time though, the scope is [...]

  11. [...] about 8 months later, on the 5th of May 2022 and then on the 12th of May 2022, there was the addition of a tidier alternative visualisation of patches and their dependencies, followed by the addition of html formatting and output production, bridging directly code and VaMP output to the blog [...]
