It's a question of semantics

I have designed and maintained websites for about four years. By far the most "successful" was my first: the mutation device, a website offering modifications and programming tutorials for the computer game Unreal. It's not a bad looking site, and took me some time to design. The central problem I wrestled with during its design was how to make it visually appealing and still ensure that it was easy for me to update. I am generally partial to minimalist web designs as a viewer, but I am rarely stirred by the minimalist impulse in my own designs, I guess preferring bombast. So balancing quite complex, old-school HTML (CSS? what's CSS?) with a need for flexibility (to add sections, tweak design, etc) was no easy task.

In fact, the experience largely turned me off web design. I was just becoming a programmer, and was inspired by the vast empty fields in which you could design structures from the ground up with complete freedom. Modularity, extensibility, flexibility, and clarity were the repeated mantras of my programming years. When I thought in these terms, HTML seemed to me restrictive, muddled and generally inconsistent. It is, of course, merely a mark-up language, not a programming language, but I guess coding refined my tastes about these things a bit. Updating the site by repetitive cut and paste, making the same changes in each of 30 different files, with all the potential for disaster that entailed (how many times did I accidentally screw up the site design in the act of adding or altering content?), was not efficient and just not fun. I would baulk at adding content I already had simply due to the effort of it.

When I built the first serious iteration of this site, I was determined to overcome this problem. The site was on a server with very little in the way of server-side functionality for site owners, and in any case I didn't much have the inclination to learn Perl, so my solution was a rather intricate system of client-side page construction using Javascript. It remains a rather ingenious solution, I think, and I'm quite proud of it -- but its reliance on Javascript gives it the fatal flaw of inconsistent compatibility even among modern web browsers. The old site is still up, although not necessarily fully functional anymore, right here.

What the Javascript did was separate "content" files from "design" files. It worked much as server-side includes do, grabbing HTML fragments from other files and slipping them into the main HTML file. But it did so on the client side, after the page was loaded in the user's browser. This made the process of adding stories to the website far easier.

The current version of this site relies on Perl to construct web pages, either on demand (as with the blog archives and the stories), or statically (as with the front page). The Perl code performs the function that the Javascript previously did, of inserting content into design templates.
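To make the idea concrete, here is a minimal sketch of inserting content into a design template. It is written in Python rather than the Perl or Javascript the site actually uses, and the template, the placeholder names and the `build_page` function are all invented for illustration; it is not the site's real code.

```python
from string import Template

# Hypothetical design template: the layout lives here, in one place only.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body>\n"
    "<h1>$title</h1>\n"
    "$content\n"
    "</body></html>"
)


def build_page(title: str, content: str) -> str:
    """Insert a content fragment into the shared design template."""
    return PAGE_TEMPLATE.substitute(title=title, content=content)


# A content file would hold only the fragment, with no layout mark-up.
story = "<p>First paragraph of a story.</p>"
page = build_page("A story", story)
print(page)
```

A change to the design then means editing the one template, not each of 30 files.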

Where's this going, jp? Uh, well. I don't think the server-side solution in itself fully assuages my misgivings about website design and maintenance. It's an essential part of the answer, and the explosion of weblogs using Movable Type gives an example of how effective the CGI solution is at making things easier.

But my point is this: CGI makes things easier for now. My old programming instincts say "let's think about reusability". Enormous amounts of content are being generated in this fashion; much of it deserves a longer shelf life than the tools of its construction will perhaps allow. Thinking about forward compatibility, and more relevantly, lateral reuse today (that is, taking the content to other places than display in a browser), is essential. The current renewal of focus on accessibility in websites, very welcome in its legal and ethical manifestations, is exactly one kind of lateral reuse. There are infinitely many others for the various types of content the web offers.

Reusability demands a firm foundation of clear and sensible specifications. For the content to be endlessly reusable (as it should be), it needs to conform to widely accepted and appropriate standards. At this late stage in the Web's development, it is difficult to imagine a new standard suddenly catching on (except by force of monopoly, and though I haven't investigated it at all, the new Blogger API might do that for a particular kind of content in a particular way). But there is in fact an existing, widely accepted standard for all textual content on the web. What? Where? The answer is kinda surprising: it's HTML.

HTML's original purpose was not to create gaudy 2D carnivals in the vein of the mutation device (and there are many better examples, but let me stick with my own). It was to semantically structure text documents. (That is, to structure them in ways that are meaningful.) The most important HTML elements in its original intention, it could be argued, were <h#> and <p> (and of course <a>, just to give the HT in HTML meaning). These elements were the key to defining the inherent structure of the content. In the mid-90s these visually inflexible elements went out of vogue as new elements appeared offering more visual control (and correspondingly less semantic control) -- in the case of these two tags, both were widely supplanted by the (structurally meaningless) <font> and <br> tags. Even more fatal to the reusability of content are the ubiquitous nested tables used to control screen layout. Not just a terrible hack, they can render content almost irretrievable.
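A rough illustration of why this matters for reuse: the sketch below pulls an outline out of a page by reading its heading elements, using Python's standard `html.parser` module. The class name, the function name and both sample fragments are made up for the example. The heading-based fragment gives up its structure easily; the <font> and <br> version of the same text gives up nothing.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for every <h1>..<h6> element."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # heading level currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def extract_outline(html: str):
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical fragments: same words, different mark-up.
semantic = "<h1>Stories</h1><p>Intro.</p><h2>First story</h2><p>Text.</p>"
presentational = (
    '<font size="6">Stories</font><br><br>Intro.<br>'
    '<font size="4">First story</font><br>Text.'
)

print(extract_outline(semantic))        # [(1, 'Stories'), (2, 'First story')]
print(extract_outline(presentational))  # []
```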

The (fitful) evolution of CSS, and CSS support in major browsers, has finally put us in a position where the structural elements of HTML can be restored to text documents without diminishing our control over the content's visual appearance in users' browsers. It's not yet always pure: embedded CSS and embedded Javascript really have no place in our content files, but are commonly found there.

It's not always easy either. CSS is a powerful but irritatingly fickle beast. There are many CSS evangelists out in web design world, and for a very good reason, which the Zen Garden illustrates far better than I ever could. But it's imperfect, and it is further hampered by inconsistent browser support. One of the first things a programmer learns is an aversion to "magic numbers" -- literal constants (1,2,3 etc) scattered throughout the code -- simply because they're a bitch to update and a bitch to debug. If I want to say that something x is 512 units wide, and I want to say that something y is half as tall as x is wide, I do not want to say that y is 256 units tall. I want to say that y's height is half x's width. Then, if x's width changes, so does y's height. CSS does not have this functionality.
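In a programming language the relationship is trivial to express. The sketch below, in Python, stores only x's width and derives y's height from it, then writes the resulting rules out as stylesheet text. The `Layout` class, the `render_css` function and the `#x` and `#y` selectors are all hypothetical, and generating the stylesheet from code is just one possible workaround, not something this site does.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    x_width: int = 512  # the only literal; everything else is derived

    @property
    def y_height(self) -> int:
        # y is half as tall as x is wide, whatever x's width happens to be.
        return self.x_width // 2


def render_css(layout: Layout) -> str:
    """Write the derived values out as plain CSS rules."""
    return (
        f"#x {{ width: {layout.x_width}px; }}\n"
        f"#y {{ height: {layout.y_height}px; }}\n"
    )


layout = Layout()
print(render_css(layout))  # y's height comes out as 256px

layout.x_width = 600
print(render_css(layout))  # y's height follows: 300px
```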

There are other weaknesses in CSS, but its central benefit, of releasing content from the confines of design and anti-structuralism, should not be undervalued.

I had a lot more to say, but this post is growing impossibly unwieldy. What I have said above, while hardly brief and to the point, pretty much encapsulates the entirety of Jeffrey Zeldman's book, Designing with Web Standards. I have been reading this recently with contrary feelings of excitement and irritation. Excitement, because the essential message of his book is an excellent one, the mission is a righteous mission (even in some ways of which he seems unwitting). Irritation, because the bastard is obsessed with the fact that he is writing a book (it is his first), and that he has something of a captive readership, who are impelled to twist into each of his excruciating "humorous" metaphors and sweat through his lengthy bouts of authorial self-consciousness. I've got a lot of respect for Mr Zeldman, and his weblog is a great resource, but damned if I'm sitting through another of his books.

(On practicing what you preach: right now I don't think I've got the standards-compliant version of this site online. It does exist, but needs a once-over before going up. You won't notice any difference when it does. Besides, I do already practice what I preach; you will note that all new content on this site (including all blog entries) is semantically valid, and more to the point, is semantically logical. So, perhaps unfortunately for you, it's also reusable.)

Joseph | 9 Sep 2003
