Wednesday, February 18, 2009

The tubes are on fire!

Or, Eric Meyer just agreed that CSS doesn't get the job done sometimes. Empirically, as somebody pointed out on reddit, 15 of the top 20 webpages (worldwide) use tables for layout instead of relying on typical CSS. After doing a whole lot of commercial web design back when the table-less layout religion first came about, and doing almost entirely web-related research for the past 4 years, I'm starting to see why this is such a big deal. Trying to write the core semantics of CSS and then parallelize it for the past half year has made me uncomfortably intimate with the what and why, and it colors how I think about alternatives.

First, what is CSS? Cascading Style Sheets. Intuitively, it's a language to describe how an HTML tree is styled. More technically, it's:
  1. A rule language: a style sheet is a set of rules, where each rule might be triggered on different document nodes, and add style constraints to them (e.g., set the width to be 50% of the parent's)
  2. A constraint language for specifying these constraints. I've been unable to bound its expressiveness (sorry, only tried for 1-2 evenings), but the important part is that typical constraints either control how a line of text appears or achieve a flow-based layout. In a flow-based layout system, generally, one sibling shows up after the previous one, and nesting is supported. In CSS, exceptions to this have steadily grown over time.
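To make those two traits concrete, here is a minimal sketch in Python. It is not how any browser implements CSS: the names Node, Rule, apply_rules, and resolve_width are made up for illustration, selectors are plain predicates, the cascade is reduced to "later rules win", and a missing width is treated as 100% of the parent (a simplification of real CSS defaults).

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """A document node (hypothetical, much simpler than a real DOM node)."""
    tag: str
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list)
    style: dict = field(default_factory=dict)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


@dataclass
class Rule:
    """One style rule: a trigger plus the constraints it adds."""
    matches: Callable[[Node], bool]  # stand-in for a CSS selector
    declarations: dict               # e.g. {"width": "50%"}


def walk(node: Node):
    """Yield the node and all of its descendants, parents first."""
    yield node
    for child in node.children:
        yield from walk(child)


def apply_rules(root: Node, rules: list) -> None:
    """Trait 1: every rule may fire on many nodes; later rules override earlier ones."""
    for node in walk(root):
        for rule in rules:
            if rule.matches(node):
                node.style.update(rule.declarations)


def resolve_width(node: Node, viewport_width: float) -> float:
    """Trait 2: solve the width constraint, which may depend on the parent's width."""
    if node.parent is None:
        base = viewport_width
    else:
        base = resolve_width(node.parent, viewport_width)
    width = node.style.get("width", "100%")  # assumption: no width means "fill the parent"
    if width.endswith("%"):
        return base * float(width[:-1]) / 100
    if width.endswith("px"):
        return float(width[:-2])
    raise ValueError(f"unsupported width: {width!r}")


body = Node("body")
sidebar = body.add(Node("div"))
para = sidebar.add(Node("p"))

# "Every div is half as wide as its parent."
apply_rules(body, [Rule(lambda n: n.tag == "div", {"width": "50%"})])
print(resolve_width(para, 800))  # prints 400.0
```

The point of the sketch is only the split: rules decide which nodes get which constraints, and a separate pass resolves those constraints against the tree.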
Judging it by these two traits, a better question is "why CSS?":
  1. Separating style from data by using rules has at least two basic purposes:
    1. Simplifies in-house development: it facilitates making sweeping changes and uniform designs, which is important for styling any non-trivial site in a maintainable way.
    2. Simplifies analysis: whether a human is reading the source or a machine, cleaning up the clutter is a big deal. A search engine, or any other agent that must generically extract data from a site and use it in a different way, such as a screen reader for the visually impaired, can easily get tripped up by nearby tags in the data tree. Having those style-related tags there, or even a non-data-related arrangement of your data tree, just for visual appearance, should be an easily avoidable reason to break this ecosystem.
  2. It describes what we need. A flow-based layout makes sense when you think about the majority of the web and historical context. It was made for laying out documents, character after character, word after word, sentence after sentence, paragraph after paragraph, etc. That's what, almost all of the web? Competing ideas around the time were things like geometric constraints that are more apropos for industrial tasks like modeling car parts -- vector tools on the web are a more modern pursuit.
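A flow-based layout in this sense can be sketched in a few lines of Python. This is a toy under stated assumptions, not the CSS box model: Box, own_height, and flow are hypothetical names, only vertical block flow is modeled, and each box's own content height is given up front instead of being computed from text.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """A block in the flow (hypothetical; margins, floats, etc. are ignored)."""
    name: str
    own_height: float = 0.0  # height of the box's own content, e.g. its lines of text
    children: list = field(default_factory=list)
    y: float = 0.0           # filled in by flow()
    height: float = 0.0      # filled in by flow()


def flow(box: Box, y: float = 0.0) -> float:
    """Place the box at y, then stack its children one after another below its
    own content. Returns the y coordinate where the next sibling starts."""
    box.y = y
    cursor = y + box.own_height
    for child in box.children:
        cursor = flow(child, cursor)  # each sibling starts where the previous ended
    box.height = cursor - y           # a container is as tall as what it holds
    return cursor


article = Box("article", children=[
    Box("h1", 40),
    Box("section", children=[Box("p1", 60), Box("p2", 30)]),
    Box("footer", 20),
])
flow(article)
for b in [article] + article.children:
    print(b.name, b.y, b.height)
```

Nothing here needs a general constraint solver: one pass over the tree is enough, which is part of why this style of layout suits documents so well.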

There's really a bit more to it. For example, I'm not sure when it started -- I think a bit after there was sufficient CSS compatibility between browsers -- but there's been a push for a long time to twist CSS to describe fluid layouts: layouts that adapt to different screen size constraints, such as for handhelds (this was before they were prevalent, however) or simply a resized window. My guess is it was really a bit after monitors started supporting huge resolutions. Other concerns include supporting multiple styles, such as for printing or for the color-impaired (I'd like to believe the latter is true -- if not, I think it'd be a fun research project). Ideas like graceful degradation make it even more fun than what we normally encounter in software.

Now we get to the debate. How do we make a website look like how we designed it in Photoshop? Tables! Done. Except, not really: it's somewhat ok in terms of category #2 of the why -- tables can describe our fancy Photoshop pictures (and let us achieve 3-column layouts without having to first attain zen) -- but I'm not content.

Let's poke at tables a bit. They leave a lot to be desired, even in category 2. Expressing what you need with tables is still too hard once you realize we're not talking about a Photoshop raster image. If we admit we have adaptive layout concerns, I want richer constraints. What about category 1 of the why? Tables are still horrible to work with because they are very low-level, so they don't satisfy my in-house desires. More fundamentally, you really need to mess around with the document structure to get the right table layout. While tables will get better support in CSS3 (?), your data tree is still likely going to be all screwed up if you rely upon them; the current mess of dummy DIVs is a little more digestible (structurally) in practice.

My intuition? The web isn't just delivering documents nowadays, so we need richer layout languages. First, richer constraints etc. need to be reconsidered (though I'm not convinced by papers about just tacking on a linear constraint system) for the flow layout portion of styles. However, we also need to support alternative layout policies: not everything is a document, so some content can benefit from alternatives like grid layouts (hinted at by the desire for tables). XUL was Mozilla's position statement here. Imagine physics-based or 2.5d layouts -- we've put ourselves into a box with flow layouts. Finally, the issue of separation has been insufficiently solved by XSLT. I like the idea of having a data layer and a styling layer, and the ability to restructure the data representation to simplify the style constraint layer. Two thoughts here. First, CSS (selectors, at least) is popular and concise, yet XSLT feels heavy. Second, and more fundamentally, in the age of dynamic pages and scraping, I suspect that the restructuring part should be a declarative shell around the DOM, just like CSS, creating a new view of the DOM. Scrapers could then choose to look at the DOM or the visual layout structure -- or, if done right, understand the bindings to both.
