Notes on Computer Graphics - A Map of Sorts

January 13th, 2020 by Diana Coman

My initial pondering of the choice regarding which way to go best when looking for sense in computer graphics (CG) turned quite quickly into a rather concentrated read of ~everything I could find relevant to CG (and even beyond it). As a side result, I ended up attempting to sketch mentally for my own needs a sort of preliminary map as I could make it out from all that is scattered out there. While probably not even complete1, it still makes my head hurt to try and keep in mind the whole of it as it came, not yet structured, not yet fully processed. As a result, I find myself with little choice but to attempt here to get it out and set it down in writing: mainly because I really need to unload it, and otherwise simply for my own future use and reference, for whatever will be the next iteration on it, around it or even directly in opposition to it.

As a starting note, it seems to me that in current CG, as in ~everything else nowadays, those involved tend to have such narrow focus - such overfitting of methods and even of thoughts, really - that one would easily think there's no coherent bigger picture even possible or ever desired. Everything is split and then split again for good measure and yet again, until one loses count of all those splits and perceives as a result just a tiny bit of the whole as if it were a world - an island, really - in and by itself. There are effectively people who describe themselves as "animators" or "modelers" or "riggers", confining and willingly limiting themselves thus to what amounts simply to different stages of the production pipeline in just *one* approach within one branch of the wider CG. And while I get the appeal of such narrowness of focus - it keeps things easier and less overwhelming - I find it both abhorrent and ill-fitted to humans: it's literally aiming to become an overspecialised and overfitted piece of machinery within an industrial process, so how can one live - and moreover, want and even aim to live - like this in the first place?2

Putting aside my questioning of this overfitting approach and switching for now to the point of view of someone aiming to explore the CG domain as a whole, one practical result of the above overspecialisation is that any sort of mapping has to start mostly from the ground up, collecting bits and pieces and gradually figuring out the specific terms that each separated island within the wider CG domain employs for its own needs. Since the whole exploration happens in this Internet world that is made essentially of words and not much more, finding those specific terms for each narrow domain is quite crucial too - the equivalent of discovering the way to each island of relevant meaning lost on the sea of irrelevancy. Nevertheless, and rather upliftingly, even the CG world does not seem to have been - or indeed to need to be - at all times this fragmented. Most notably, I was able to find3 for instance the refreshing book of Glassner from 19944 that provides an enjoyable and quite comprehensive discussion of computer graphics as the result of combining three previously distinct domains: the human visual system, digital signal processing and the interaction of matter and light.

As far as I understand it, the image synthesis approach to computer graphics aims to identify the relevant known physical aspects and then develop corresponding computer models - the key words here being physical and models: on one hand, the whole approach is explicitly informed by the current understanding of reality and on the other hand, its aim is to effectively translate this to the relatively new and certainly more restrictive domain of computing5. Even within image synthesis, one inevitable split that is readily noticed is that between geometry (3D) and imaging (2D representations of 3D geometry) or the corresponding processes of modeling (obtaining the desired shapes) and rendering (translating them to the 2D image that is displayed on a computer screen).

The split between modeling and rendering already creates smaller inner islands of its own, on which even the same terms come with different definitions as they are seen from different perspectives. For starters, the very humble pixels themselves are not quite the same thing: from a modeling perspective, pixels get defined as geometrical areas intersected by shapes; from a rendering perspective though, they are seen as samples of a continuum. This very difference further feeds into any related notions, such as alpha for instance: from a modeling perspective, alpha represents the percentage of coverage by an object; from a rendering perspective, the same alpha is simply opacity. The widespread "definition" of alpha as just "opacity" thus simply reflects the current predominance of rendering over modeling: because ultimately the whole point of CG is to get 2D representations on screens, it would seem that the rendering itself can steal the whole show at times, leaving the modeling in the shadows as a sort of unloved preliminary step.
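To make the coverage/opacity difference concrete, here's a minimal sketch (Python, with made-up values) of the Porter-Duff "over" compositing operator - which was, in fact, originally derived from the coverage reading of alpha, even though nowadays it gets read almost exclusively as opacity:

```python
# Porter-Duff "over": composite a foreground pixel over a background pixel.
# Under the modeling (coverage) reading, alpha is the fraction of the pixel's
# area covered by the object; under the rendering reading, it is opacity.
# Either way, the arithmetic comes out the same.

def over(fg_color, fg_alpha, bg_color, bg_alpha):
    """Composite foreground over background; colors are (r, g, b) in [0, 1]."""
    out_alpha = fg_alpha + bg_alpha * (1.0 - fg_alpha)
    if out_alpha == 0.0:
        return (0.0, 0.0, 0.0), 0.0
    out_color = tuple(
        (fc * fg_alpha + bc * bg_alpha * (1.0 - fg_alpha)) / out_alpha
        for fc, bc in zip(fg_color, bg_color)
    )
    return out_color, out_alpha

# A half-covering red object over an opaque white background:
print(over((1.0, 0.0, 0.0), 0.5, (1.0, 1.0, 1.0), 1.0))
# -> ((1.0, 0.5, 0.5), 1.0)
```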

Walking down the branches of modeling and rendering reveals further splits that seem to be increasingly exploratory rather than strictly inevitable forks in the road. Nevertheless, and perhaps unsurprisingly, there is always and on all levels this clear separation and even tension between approaches that focus on further developing that translation of physical models to the computer domain and approaches that focus on the narrow treatment of specific cases. On the modeling branch, this tends to play out, as far as I see it, as mathematical descriptions of shapes and surfaces (vector representations, constructive solid geometry and various geometric algebras6) versus enumerations of vertices and triangles or polygons (rasterised formats essentially). On the rendering branch, there are on one hand the ray-tracing approaches that aim for physical correctness7 and on the other hand the rasterization approaches that aim first and foremost for speed and fit to existing GPUs8.
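For a taste of the ray-tracing side (a generic sketch, the names and values below are mine), the core operation is intersecting a ray with the scene's geometry; for a sphere this reduces to solving a quadratic equation:

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Return the distance t to the nearest intersection of the ray
    origin + t * direction with the sphere, or None if the ray misses.
    direction is assumed to be normalized."""
    oc = tuple(o - c for o, c in zip(origin, center))
    # Quadratic a*t^2 + b*t + c = 0, with a = |direction|^2 = 1 when normalized.
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None  # ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2.0  # the nearer of the two roots
    return t if t > 0.0 else None

# Ray from the origin along +z towards a unit sphere centered at (0, 0, 5):
print(ray_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # -> 4.0
```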

Going further down this tree of graphics splits, there are of course materials and textures and animations as main categories, even without going into specifics such as the rendering of fonts. At this level things really tend to explode in numbers, mainly to fit all sorts of quirks of tools and GPUs. Looking strictly at the actual methods employed rather than the output formats or specific bits of GPU programming, there seems to be considerably less variety: the bulk of materials and textures can be neatly split into programmatically vs manually generated; animations are essentially defined as a set of different poses of the character, with interpolated movement between the poses to create smooth transitions.
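As a tiny illustration of the programmatic side (a minimal sketch; the function and its parameters are my own), a procedural checkerboard texture needs no stored image at all, just a function evaluated at each texture coordinate:

```python
def checker(u, v, tiles=8):
    """Procedural checkerboard: return a color for texture coords (u, v) in [0, 1)."""
    black, white = (0.1, 0.1, 0.1), (0.9, 0.9, 0.9)
    # Alternate colors based on the parity of the tile indices.
    if (int(u * tiles) + int(v * tiles)) % 2 == 0:
        return white
    return black

print(checker(0.05, 0.05))  # first tile -> white
print(checker(0.20, 0.05))  # next tile over -> black
```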

There are just a handful of approaches to calculating lighting and, at least on the non-pbr side, the main approach seems to remain simply to calculate the normal vectors at all vertices, calculate the illumination at all vertices and then either render with only this information or interpolate it across the triangles/polygons that make up all surfaces. The Gouraud shading model interpolates the lighting values calculated at each vertex across each triangle. The Blinn-Phong shading model interpolates instead the normals at each vertex, the direction of the light source and the direction of the camera across each triangle, calculating the lighting at each pixel. There is a further split regarding what type of illumination is used: the older, "classical" approach uses diffuse and specular as properties of a material/texture, while a newer approach more rooted in pbr seems to use "metalness" as well as a base colour and some specular factor instead. With respect to shadows, the focus seems to be still on the old options of either stencil shadows (calculating on the fly the volume of space for which an object blocks the light from a source)9 or precomputed shadow maps (those store, for each pixel, the depth of the closest surface as seen from a given perspective, typically that of the light source).
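For reference, here's a minimal sketch of that per-pixel lighting calculation in the usual Blinn-Phong formulation (generic, not tied to any particular renderer; the material values below are made up):

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def blinn_phong(normal, to_light, to_view, diffuse, specular, shininess):
    """Lighting intensity at one surface point (single white light, no ambient)."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_view)
    # Diffuse term: Lambert's cosine law.
    diff = max(0.0, sum(a * b for a, b in zip(n, l)))
    # Specular term: Blinn's halfway vector instead of the reflected ray.
    h = normalize(tuple(a + b for a, b in zip(l, v)))
    spec = max(0.0, sum(a * b for a, b in zip(n, h))) ** shininess
    return diffuse * diff + specular * spec

# Light and camera both along the surface normal: full diffuse + full specular.
print(blinn_phong((0, 0, 1), (0, 0, 1), (0, 0, 1), 0.8, 0.2, 32))  # -> 1.0
```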

The main splits with respect to animation seem to be simply in terms of what data is actually stored for each pose: the most common approach stores the character's skeleton described as a hierarchical structure of bones, on which some skin/texture is painstakingly attached at all relevant points; alternatively, it's possible to store the vertices' positions for each pose instead of bone structures (especially for animating items that are rather boneless intrinsically, such as blobs perhaps). At an even lower level, one can further look into where the data is stored and how much takes place on the GPU directly (aka in a "shader"). In principle one can also mix skeleton and vertex-based animations for the same character, of course.
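As an illustration of the vertex-based option with interpolated movement (a minimal sketch; the pose data is entirely made up), the smooth transition between two stored poses is simply a linear blend of the corresponding vertex positions:

```python
def lerp_pose(pose_a, pose_b, t):
    """Blend two stored poses; each pose is a list of (x, y, z) vertex positions
    and t in [0, 1] is the animation parameter between them."""
    return [
        tuple(a + (b - a) * t for a, b in zip(va, vb))
        for va, vb in zip(pose_a, pose_b)
    ]

# Two made-up poses of a two-vertex "limb", sampled halfway between them:
rest = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
raised = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(lerp_pose(rest, raised, 0.5))  # -> [(0.0, 0.0, 0.0), (0.5, 1.0, 0.0)]
```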

Surprisingly missing, to my eye, from the approaches to both the modeling of shapes and animation for CG is anything that aims even in the least degree to describe them at a higher level. For animation, this would mean anything higher level than this magic lantern approach10 of successive drawings of the subject. For shapes, this would really mean anything that actually models a group as opposed to painting one single instance. And this focus on the low-level, instance by instance production process seems to me quite related to the excessive focus on rendering as the main concern: the whole of modeling seems to lose sight of having any specific meaning on its own, outside of preparing stuff "as best as possible" for effective rendering. To recover such meaning, I had to go either to other domains11 or to some islands that still focus on actually modeling, in the proper sense of the term12.

One quite interesting such island for the modeling of 3D shapes13 seems to successfully describe through parametric equations the entire surface of various types of molluscs and echinoids. The approach supposedly works both as a sort of reverse engineering of known organisms and as an exploratory approach to the potential phenotypic variation of each species. They certainly have pretty pictures of quite a few shells!
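Their actual equations are not reproduced here, but for a flavour of the approach, here's a minimal sketch of the classic logarithmic helico-spiral with a circular generating curve that such shell models usually start from (all parameter values are mine, purely illustrative):

```python
import math

def shell_point(theta, s, a=0.1, k=0.15, r0=0.05, c=0.05):
    """One point on a turbinate shell surface:
    theta winds around the coiling axis, s goes around the generating circle.
    The spiral radius, tube radius and height all grow as exp(k * theta)."""
    growth = math.exp(k * theta)
    spiral = a * growth   # distance of the tube center from the coiling axis
    tube = r0 * growth    # radius of the generating circle at this point
    x = (spiral + tube * math.cos(s)) * math.cos(theta)
    y = (spiral + tube * math.cos(s)) * math.sin(theta)
    z = c * growth + tube * math.sin(s)
    return (x, y, z)

# Sample the surface on a coarse grid (6 turns, 100 x 16 samples):
points = [shell_point(2 * math.pi * 6 * i / 100, 2 * math.pi * j / 16)
          for i in range(100) for j in range(16)]
print(len(points), points[0])
```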

With regards to animation islands, my quick search turned up Laban's Movement Analysis (LMA), a framework initially designed for choreography and the theatre. I found it quite interesting as it aims precisely to provide a generic model (and therefore the useful terms) of human movement. Based on the original theory, there seem to be quite a few systems developed at later stages by others, such as the BESS (Body, Effort, Shape and Space) by Bartenieff. With respect to movement only, LMA models it as a combination of four categories with 2 possible elements each (Space for focus, with possible elements direct and indirect, Time with quick or sustained, Weight with heavy or light and Flow with bound or free). Combining the 2 possible elements of each of Space, Time and Weight yields the 8 basic efforts (with the corresponding Flow element noted alongside), neatly shown in a table:

Effort   Space/Focus   Time        Weight   Flow
Punch    Direct        Quick       Heavy    Bound
Dab      Direct        Quick       Light    Bound
Press    Direct        Sustained   Heavy    Bound
Glide    Direct        Sustained   Light    Free
Slash    Indirect      Quick       Heavy    Free
Flick    Indirect      Quick       Light    Free
Wring    Indirect      Sustained   Heavy    Bound
Float    Indirect      Sustained   Light    Free

While the table above might not necessarily translate directly or need to be taken literally for defining movements in game, I admit that I like its neatness and its focus on modeling the movement rather than the appearance of that movement. So perhaps it makes more sense to have graphical representations for each character for a set of predefined types of movement (even for each limb if needed) and then simply allow the client to choose whatever specific representation set it wants to use at any given time (see the sketch below). At any rate, this is just a summary of what I found most useful out of my (otherwise much wider) initial exploration, so it's meant mainly as an initial foothold rather than a full base of anything as such. Any direction deemed worthy of more detailed investigation can be pursued at a later time.
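For instance (a sketch only; the numeric mapping is entirely made up for illustration), the efforts table translates directly into data that can then be mapped onto whatever animation parameters a client's chosen representation needs, keeping the model of the movement separate from its appearance:

```python
# The eight basic efforts as (space, time, weight, flow), straight from the table.
EFFORTS = {
    "punch": ("direct",   "quick",     "heavy", "bound"),
    "dab":   ("direct",   "quick",     "light", "bound"),
    "press": ("direct",   "sustained", "heavy", "bound"),
    "glide": ("direct",   "sustained", "light", "free"),
    "slash": ("indirect", "quick",     "heavy", "free"),
    "flick": ("indirect", "quick",     "light", "free"),
    "wring": ("indirect", "sustained", "heavy", "bound"),
    "float": ("indirect", "sustained", "light", "free"),
}

def movement_params(effort):
    """Map an effort onto illustrative animation parameters (values made up)."""
    space, time, weight, flow = EFFORTS[effort]
    return {
        "duration_s": 0.2 if time == "quick" else 1.5,
        "amplitude": 1.0 if weight == "heavy" else 0.4,
        "curved_path": space == "indirect",
        "continuous": flow == "free",
    }

print(movement_params("punch"))
# -> {'duration_s': 0.2, 'amplitude': 1.0, 'curved_path': False, 'continuous': False}
```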


  1. Do leave me a comment there at the bottom of this article if you have something to add, to point out or anything really, as long as it's related and you are willing to actually stand by it. Talking to those who know more about a topic is precisely how learning goes, so talk to me if you know more about it and equally well talk to me if you know less about it, all right? 

  2. This overfitting of people strikes me, even more, as a sort of meat being cheaper than electronics, since apparently it's people aiming to become more like robots rather than the other way around. For all the concerns about and long discussions around the idea of computers gaining consciousness and becoming more human-like, it strikes me all of a sudden that there has been remarkably little - none that I'm aware of, really - mirroring discussion of what seems to be a far more real trend: that of humans gaining (is it a gain?) the desire and aim of being more machine-like. 

  3. After I figured out the key words "image synthesis". 

  4. Andrew Glassner, Principles of Digital Image Synthesis, Morgan Kaufmann, 1994 

  5. Glassner builds up gradually to the radiance equation and notes both its power for image synthesis (as an expression of the distribution of light within a scene) and the trouble with it (the implicit rather than explicit description of radiance distribution).  

  6. As mentioned in #eulora, I fell in love with Grassmann algebra, yes. Still, it's not the only option out there and there are even some interesting explorations - such as "grepping for shapes" as a way to model - and some interesting tools, such as Urbanek's PlastiSketch, which uses vector representation to design 3D models - though sadly it's exclusively Java-based. 

  7. Those seem to be used as such mainly in animation films and/or dedicated rendering setups, as they tend to be very costly in terms of both computing power and time. A quite nicely published example of pbr - with a "literate programming" claim, no less! - if you ever want to play with that, is PBRT, as published by Pharr, Jakob and Humphreys. Essentially the "pure" physically based rendering - pbr - focuses more on exploring what might be achievable with this approach, regardless of whether or to what extent it is also practical currently. So the computer games part of CG is kind of stuck trying at most to pick up from the advances in pbr some bits and pieces that might - or might not - fit as improvements to non-pbr approaches. 

  8. Basically, a lot of information is precomputed and stored for each vertex and/or triangle/polygon in each object or scene, so that rendering consists mainly in converting all that to the 2D image on screen, often through multi-step passes that increasingly tweak the colours of various pixels as more information is considered. It's here that the whole shaders mess comes into play too. 

  9. Crow, Frank, "Shadow Algorithms for Computer Graphics", Proc. of SIGGRAPH, 1977, pp.242-248 

  10. That's first mentioned in the 1650s, Giovanni Fontana, if memory serves! 

  11. E.g. GIS aka geographic information systems, which obviously care quite a lot about the meaning of the model and relegate rendering to its subservient place where it belongs; wtf is this worship of flat images now otherwise! 

  12. Aka making a model of something as opposed to just a drawing of an instance of it, ffs! 

  13. Pappas, J.L. and Miller, D.J., "A Generalized Approach to the Modeling and Analysis of 3D Surface Morphology in Organisms"