The Good Soldier LMeyerov: February 2009

Saturday, February 21, 2009

Even more whitespace sensitive languages

Michael Tschantz, on Maude: "Man, I can't believe this thing lets you overload [white]space. I mean, that's the most warped thing I've seen since C."

We all know Python is whitespace sensitive, as are some more esoteric languages. I've always been annoyed by browsers' inconsistent handling of whitespace in HTML, which leaks into the styling layers. More recently, I realized that CSS uses whitespace as a combinator in its selector language: "div + p.cool li" employs two combinators, "+" (adjacent sibling) and " " (descendant).
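To make the "whitespace is a combinator" point concrete, here is a rough sketch that splits a simple selector into its compound selectors and the combinators between them. The helper name splitCombinators is made up for illustration, and it only understands " ", "+", ">", and "~"; it ignores attribute selectors, strings, and pseudo-class arguments, so it is nowhere near a real CSS parser.

```javascript
// Hypothetical helper (not from any CSS library): returns an array that
// alternates compound selectors and combinators.
function splitCombinators(selector) {
  const s = selector.trim();
  // Either an explicit combinator with optional surrounding whitespace,
  // or a bare run of whitespace (the descendant combinator).
  const re = /\s*([+>~])\s*|\s+/g;
  const parts = [];
  let last = 0;
  let m;
  while ((m = re.exec(s)) !== null) {
    parts.push(s.slice(last, m.index)); // compound selector before the combinator
    parts.push(m[1] || ' ');            // whitespace alone means "descendant"
    last = re.lastIndex;
  }
  parts.push(s.slice(last));            // trailing compound selector
  return parts;
}

console.log(splitCombinators('div + p.cool li'));
```

Note that the spaces around "+" are swallowed as padding, while the lone space before "li" is kept as a combinator in its own right.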

More fun? Wikis Peano-encode whitespace, so one space renders as a space, while two might mean a code block with special formatting.

Reddit's comment system assigns different semantics to Peano-encoded runs of one, three, and four spaces, and possibly more. Apparently, not all of its users get it right, either.

Friday, February 20, 2009

Aha! moment

I was playing around with some math today, and realized you can reduce testing parallel programs to the birthday paradox. A fun implication is that getting a 20x increase in parallelism equates to achieving nine-nines of reliability when you slow back down. More to come, perhaps :)
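The post doesn't show the math, so the following is only the textbook birthday-paradox quantity, not the reduction to parallel testing: the chance that at least two of k uniformly random picks out of n possibilities coincide. The function name collisionProbability is an assumption for illustration.

```javascript
// Probability that k independent, uniformly random picks from n
// possibilities contain at least one repeated value.
function collisionProbability(k, n) {
  let pNoCollision = 1;
  for (let i = 0; i < k; i++) {
    // The (i+1)-th pick must avoid the i values already taken.
    pNoCollision *= (n - i) / n;
  }
  return 1 - pNoCollision;
}

// The classic instance: 23 people, 365 possible birthdays.
console.log(collisionProbability(23, 365)); // just over one half
```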

Wednesday, February 18, 2009

The tubes are on fire!

Or, Eric Meyer just agreed that CSS just doesn't get the job done sometimes. Empirically, as somebody pointed out on reddit, we see 15 of the top 20 webpages (worldwide) use tables for layout, instead of relying upon typical CSS. After doing a whole lot of commercial web design back when the table-less layout religion first came about, and doing almost entirely web-related research for the past 4 years, why this is such a big deal is starting to make a lot more sense. Trying to write the core semantics of CSS and then parallelizing it for the past half year has made me uncomfortably intimate with the what and why, and colors how I think about alternatives.

First, what is CSS? Cascading Style Sheets. Intuitively, it's a language to describe how an HTML tree is styled. More technically, it's:

- A rule language: a style sheet is a set of rules, where each rule might be triggered on different document nodes, and add style constraints to them (e.g., set the width to be 50% of the parent's).
- A constraint language for specifying these constraints. I've been unable to bound its expressiveness (sorry, only tried for 1-2 evenings), but the important part is that typical constraints are either for how a line of text appears or to achieve a flow-based layout. In a flow-based layout system, generally, one sibling shows up after the previous, and nesting is supported. In CSS, exceptions to this have steadily grown over time.
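A toy model of those two halves, under heavy assumptions: rules are plain predicates rather than real selectors, the only constraint understood is a width given as a percentage of the parent's, and every name here (rules, applyRules, resolveWidths, percentOfParent) is invented for the sketch. Real CSS adds cascading, specificity, and far richer constraints.

```javascript
// Rule language: each rule is triggered on some nodes and attaches constraints.
const rules = [
  { matches: (node) => node.tag === 'div', constraints: { width: { percentOfParent: 50 } } },
  { matches: (node) => node.cls === 'cool', constraints: { color: 'blue' } },
];

// Attach the constraints of every matching rule to each node (later rules win).
function applyRules(node, ruleSet) {
  const style = {};
  for (const rule of ruleSet) {
    if (rule.matches(node)) Object.assign(style, rule.constraints);
  }
  const children = (node.children || []).map((c) => applyRules(c, ruleSet));
  return { ...node, style, children };
}

// Constraint language: solve the (trivial) width constraints top-down.
function resolveWidths(node, parentWidth) {
  const w = node.style.width;
  const width = w ? (parentWidth * w.percentOfParent) / 100 : parentWidth;
  return { tag: node.tag, width, children: node.children.map((c) => resolveWidths(c, width)) };
}

const tree = { tag: 'body', children: [{ tag: 'div', children: [{ tag: 'p', cls: 'cool' }] }] };
const styled = applyRules(tree, rules);
console.log(JSON.stringify(resolveWidths(styled, 800)));
```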

Judging it by these two traits, a better question is "why CSS?":

1. Separating style from data by using rules has at least two basic purposes:

- Simplifies in-house development: it facilitates sweeping changes and uniform designs, which is important for styling any non-trivial site in a maintainable way.
- Simplifies analysis: whether a human or a machine is reading the source, cleaning up the clutter is a big deal. A search engine, or any other agent that must generically extract data from a site and use it in a different way, such as a screen reader for the visually impaired, can easily get tripped up by nearby tags in the data tree. Having those style-related tags there, or even a non-data-related arrangement of your data tree, just for visual appearance, should be an easily avoidable reason to break this ecosystem.

2. It describes what we need. A flow-based layout makes sense when you think about the majority of the web and historical context. It was made for laying out documents, character after character, word after word, sentence after sentence, paragraph after paragraph, etc. That's what, almost all of the web? Competing ideas around the time were things like geometric constraints that are more apropos for industrial tasks like modeling car parts -- vector tools on the web are a more modern pursuit.
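For a feel of "word after word", here is a minimal greedy flow layout, assuming fixed-width characters and a single space between words. The name flowLayout and its parameters are made up for the sketch; real browsers also handle nesting, floats, hyphenation, and much more.

```javascript
// Place each word after the previous one, wrapping to a new line
// when the next word would overflow the line width.
function flowLayout(words, lineWidth, charWidth = 1, lineHeight = 1) {
  const boxes = [];
  let x = 0;
  let y = 0;
  for (const word of words) {
    const w = word.length * charWidth;
    if (x > 0 && x + w > lineWidth) {
      x = 0;           // start a new line
      y += lineHeight;
    }
    boxes.push({ word, x, y });
    x += w + charWidth; // advance past the word plus one space
  }
  return boxes;
}

console.log(flowLayout(['the', 'quick', 'brown', 'fox'], 10));
```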

There's really a bit more to it. For example, I'm not sure when it started -- I think a bit after there was sufficient CSS compatibility between browsers -- but there's long been a push to twist CSS to describe fluid layouts: allow the layout to adapt to different screen size constraints, such as for handhelds (this was before they were prevalent, however) or simply a resized window. My guess is that it really began a bit after monitors started supporting huge resolutions. Other concerns include supporting multiple styles, such as for printing or for the color-impaired (I'd like to believe the latter is true -- if not, I think it'd be a fun research project). Ideas like graceful degradation make it even more fun than what we normally encounter in software.

Now we get to the debate. How do we make a website look like how we designed it in Photoshop? Tables! Done. Except, not really: it's somewhat ok in terms of category #2 of the why -- tables can describe our fancy Photoshop pictures (and let us achieve 3-column layouts without having to first attain zen) -- but I'm not content.

Let's poke at tables a bit. They leave a lot to be desired, even in category 2. Expressing what you need with tables is still too hard once you realize we're not talking about a Photoshop raster image. If we admit we have adaptive layout concerns, I want richer constraints. What about category 1 of the why? Tables are still horrible to work with because they are very low-level, so they don't satisfy my in-house desires. More fundamentally, you really need to mess around with the document structure to get the right table layout. While tables will get better support in CSS3 (?), your data tree is still likely going to be all screwed up if you rely upon them; the current mess of dummy DIVs is a little more digestible (structurally) in practice.

My intuition? The web isn't just delivering documents nowadays, so we need richer layout languages. First, richer constraints etc. need to be reconsidered (though I'm not convinced by papers about just tacking on a linear constraint system) for the flow layout portion of styles. However, we also need to support alternative layout policies: not everything is a document, so some content can benefit from alternatives like grid layouts (hinted at by the desire for tables). XUL was Mozilla's position statement here. Imagine physics-based or 2.5D layouts -- we've put ourselves into a box with flow layouts. Finally, the issue of separation has been insufficiently solved by XSLT. I like the idea of having a data layer and a styling layer, and the ability to restructure the data representation to simplify the style constraint layer. Two thoughts here. First, CSS (selectors, at least) is popular and concise, yet XSLT feels heavy. Second, and more fundamentally, in the age of dynamic pages and scraping, I suspect that the restructuring part should be a declarative shell around the DOM, just like CSS, creating a new view of the DOM. Scrapers could then choose to look at the DOM or the visual layout structure -- or, if done right, understand the bindings to both.

Tuesday, February 17, 2009

toy language

For a mini-assignment, we were supposed to pick a few problem types and sketch a language to run them in parallel. I picked unstructured grid, pipes, and events. The idea is to step down from FRP a bit to be more Esterel-like, and to add an explicit spawn call for events.

Some fun snippets:

Warmup:

var mypipe = <||> //a new pipe!
var mypipe2 = <mypipe + 1> //returns a new pipe
<mypipe> ! 20 //inject an event into the pipeline, updating mypipe and mypipe2
ticks(1) ||| \_. assert(mypipe + 1 == mypipe2)
assert(mypipe2 == 21)
assertException(\_.<mypipe> + 1)
assert(<mypipe> != <mypipe2>)
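One possible reading of the warmup in plain JavaScript, as a sketch only: a pipe holds its latest value, and a derived pipe is recomputed whenever its source receives an event. The Pipe class, inject, and map are names invented here; the toy language's tick and exception semantics are not modeled.

```javascript
// A pipe remembers its latest event and pushes updates to derived pipes.
class Pipe {
  constructor(initial) {
    this.value = initial;
    this.listeners = [];
  }
  // Rough analogue of `<pipe> ! v`: inject an event and propagate it.
  inject(v) {
    this.value = v;
    for (const listener of this.listeners) listener(v);
  }
  // Rough analogue of `<pipe + 1>`: a new pipe derived from this one.
  map(f) {
    const out = new Pipe(this.value === undefined ? undefined : f(this.value));
    this.listeners.push((v) => out.inject(f(v)));
    return out;
  }
}

const mypipe = new Pipe();
const mypipe2 = mypipe.map((v) => v + 1);
mypipe.inject(20);
console.log(mypipe.value, mypipe2.value);
```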

Bounded parallelism:

//input: var source =

var batches =
  <source>.foldE(\(v,acc). if acc.length == 100: [v]
                           else: acc.push(v)
                                 acc,
                 [])
          .filterE(\a. a.length == 100)
function decrypt (v) v
var workers = [ batches ||| spawn()
                        ||| map(\arr. decrypt(arr[x]))
                for x in range(100) ]
workers.merge()
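The same shape in ordinary JavaScript, as a sketch under assumptions: collect the input into full batches of a fixed size, then run at most one batch's worth of work concurrently. toBatches and processInBatches are hypothetical names, and, like the filterE above, a trailing partial batch is dropped rather than emitted.

```javascript
// Group items into full batches of `size`; an incomplete tail is dropped.
function toBatches(items, size) {
  const batches = [];
  for (let i = 0; i + size <= items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `work` on one batch at a time, so at most `size` calls are in flight.
async function processInBatches(items, size, work) {
  const results = [];
  for (const batch of toBatches(items, size)) {
    const done = await Promise.all(batch.map((v) => work(v)));
    results.push(...done);
  }
  return results;
}

// Identity "decrypt", as in the snippet above.
processInBatches([1, 2, 3, 4, 5, 6, 7], 3, async (v) => v).then((r) => console.log(r));
```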

Data partitioning:

var graph = {v: <|1|>,
             owner: root,
             neighbors: [root,
                         {v: <|2|>, neighbors: [], owner: root},
                         {v: <|3|>, neighbors: [], owner: root}]}
partition p;
for n in graph:
  var r = n.owner.neighbors.foldi(
    \(val,i,acc). acc ? acc : (val == n ? i : acc))
  n.__region = n.neighbors.__region = p[r]
for n in graph:
  n.v' = <|average(spawn(n.__partition)(
    n.neighbors.map(\n. n.v)))|>
ticks(999) ||| \_. for n in root: <n.v> ! n.v'

Thursday, February 5, 2009

success!

Submitted! The insight: by extending the membrane pattern to also be a secure advice system, you can build a very effective policy system by injecting it into the membrane. This is a big boon to web browsers: you can make different systems to support mashups, inter-frame sharing, secure extensions, etc.

Normally, with just a membrane (assuming you've managed to implement one securely, which is tricky!), you can do the following:

var m = makeCanonicalMembrane(webpage);
broadcast(m.view);
...
m.disable(); //disable any access to the webpage or associated objects
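For readers who want to poke at the idea, here is a rough membrane sketch using the modern Proxy API (which did not exist when this was written, so it is not how makeCanonicalMembrane works). The name makeMembraneSketch is made up, and the sketch is deliberately incomplete: it only wraps values flowing out of the model, so arguments and assigned values coming from the view side leak in unwrapped.

```javascript
// Wrap an object graph lazily; one switch disables every wrapper at once.
function makeMembraneSketch(target) {
  let enabled = true;
  const wrappers = new WeakMap(); // real object -> its proxy in the view

  function guard() {
    if (!enabled) throw new Error('membrane disabled');
  }

  function wrap(value) {
    const isObject = value !== null &&
      (typeof value === 'object' || typeof value === 'function');
    if (!isObject) return value; // primitives pass through
    if (!wrappers.has(value)) {
      wrappers.set(value, new Proxy(value, {
        get(obj, prop) { guard(); return wrap(Reflect.get(obj, prop)); },
        set(obj, prop, v) { guard(); return Reflect.set(obj, prop, v); },
        apply(fn, thisArg, args) { guard(); return wrap(Reflect.apply(fn, thisArg, args)); },
      }));
    }
    return wrappers.get(value);
  }

  return { view: wrap(target), disable() { enabled = false; } };
}

const page = { form: { password: 'hunter2' } };
const m = makeMembraneSketch(page);
const form = m.view.form;   // a wrapper, not the real form object
console.log(form.password); // reads through to the model
m.disable();                // from here on, every wrapper throws on access
```

Caching wrappers in a WeakMap keeps identity stable inside the view (m.view.form === m.view.form) while never handing out the real object.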

That's ugly, so we might provide an enable/disable method for individual objects:

var m = makeCanonicalMembrane(webpage);
m.disable($('passwordField'));
broadcast(m.view);

That's closer to the traditional security standpoint: permit/deny policies. However, from a software engineering perspective... the only worse solution is the previous one. So, we introduce an advice system where users can plug in their own methods:

function encrypt (o, p) { return hash(o[p]); }
var m = makeAdviceMembrane(webpage);
m.advise($('passwordField'), {getters: {value: encrypt}});
broadcast(m.view);
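A sketch of what getter advice might look like with a Proxy-based view, keeping the (o, p) advice signature from the snippet above. makeAdviceViewSketch and its advise method are hypothetical stand-ins, not the real makeAdviceMembrane, and the masking function is only a placeholder for a real hash.

```javascript
// A view whose property reads can be intercepted per object and per property.
function makeAdviceViewSketch(target) {
  const advice = new WeakMap();   // real object -> { getters: { prop: fn(o, p) } }
  const wrappers = new WeakMap(); // real object -> its proxy in the view

  function wrap(value) {
    if (value === null || typeof value !== 'object') return value;
    if (!wrappers.has(value)) {
      wrappers.set(value, new Proxy(value, {
        get(obj, prop) {
          const a = advice.get(obj);
          if (a && a.getters && Object.prototype.hasOwnProperty.call(a.getters, prop)) {
            return a.getters[prop](obj, prop); // advice runs instead of the raw read
          }
          return wrap(Reflect.get(obj, prop));
        },
      }));
    }
    return wrappers.get(value);
  }

  return {
    view: wrap(target),
    advise(obj, a) { advice.set(obj, a); },
  };
}

const field = { value: 'secret' };
const view = makeAdviceViewSketch({ passwordField: field });
// Placeholder advice: mask the value instead of hashing it.
view.advise(field, { getters: { value: (o, p) => '*'.repeat(o[p].length) } });
console.log(view.view.passwordField.value);
```

The model is untouched: reading field.value directly still yields the real value, and only consumers of the view see the advised result.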

But this is clunky -- you are literally advising every single little function call. However, at this point, you have provided a simple way to inject a policy system which knows how to manipulate the view. For example, you might want to encrypt every password field. We might make a policy system that supports selectors, and translates calls into lower-level advice:

var mp = makePolicyMembrane(makeAdviceMembrane(webpage));
mp.applyPolicy(
  webpage,
  {selector: "//input[@type='password']",
   enabled: false});
broadcast(mp.view);
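How a policy might be lowered into per-object advice, sketched with big simplifications: the selector is a plain predicate over a plain object tree instead of an XPath expression over the DOM, and "enabled: false" is translated into a throwing getter for each of the node's own properties. applyPolicySketch and the advise callback are assumptions, not the real makePolicyMembrane.

```javascript
// Walk the model, and for every node the policy selects, emit low-level advice.
function applyPolicySketch(root, policy, advise) {
  const seen = new Set(); // guards against cycles in the object graph
  const deny = () => { throw new Error('denied by policy'); };

  (function walk(node) {
    if (node === null || typeof node !== 'object' || seen.has(node)) return;
    seen.add(node);
    if (policy.enabled === false && policy.selector(node)) {
      const getters = {};
      for (const key of Object.keys(node)) getters[key] = deny;
      advise(node, { getters }); // one advice record per selected node
    }
    for (const key of Object.keys(node)) walk(node[key]);
  })(root);
}

const webpage = {
  form: {
    user: { type: 'text', value: 'al' },
    pass: { type: 'password', value: 'pw' },
  },
};
applyPolicySketch(
  webpage,
  { selector: (n) => n.type === 'password', enabled: false },
  (node, advice) => console.log('advise', node.type, Object.keys(advice.getters))
);
```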

This is really a new way of secure programming. Instead of sharing an object, you share a customizable and secure view of the object, and automatically generate a customized one by providing a declarative security policy. There are a lot of directions to take this. For example, I'm playing with the ability to hide DOM nodes, not just disable them. This is tantamount to taking an element in a linked list and, in a view of that list, having all previous and next pointers skip over it. Fun with bidirectional programming commences once you support mutation. Furthermore, I can likely hook this into our old Margrave verification system!
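The linked-list analogy can be sketched directly: a read-only view of a doubly linked list in which hidden nodes are skipped by both prev and next. makeListView is a made-up name, the node shape ({value, prev, next}) is assumed, and mutation through the view, where the bidirectional fun starts, is not handled.

```javascript
// Build a read-only view of a doubly linked list that skips hidden nodes.
function makeListView(head, hidden) {
  const views = new Map(); // real node -> view node, so view identity is stable

  // Follow `dir` ('next' or 'prev') until reaching a visible node or the end.
  function skip(node, dir) {
    while (node && hidden.has(node)) node = node[dir];
    return node;
  }

  function view(node) {
    if (!node) return null;
    if (!views.has(node)) {
      views.set(node, {
        get value() { return node.value; },
        get next() { return view(skip(node.next, 'next')); },
        get prev() { return view(skip(node.prev, 'prev')); },
      });
    }
    return views.get(node);
  }

  return view(skip(head, 'next'));
}

// a <-> b <-> c, with b hidden from the view.
const a = { value: 'a', prev: null, next: null };
const b = { value: 'b', prev: a, next: null };
const c = { value: 'c', prev: b, next: null };
a.next = b;
b.next = c;
const listView = makeListView(a, new Set([b]));
console.log(listView.value, listView.next.value);
```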

Anyways, life is always more fun with pictures. This stuff actually works with bubblemark, where we modify every ball to actually be a view of one! The horizontal bar is where you jump from running a smooth animation to a jerky one.

Tuesday, February 3, 2009

No language is an island

Firefox 3.0.5:

try { document.write('error: ' + e); }
catch (ee) {
  document.write('error giving an error: ' + ee);
}
==> error giving an error: TypeError: can't convert e to primitive type

Back to work..

Monday, February 2, 2009

a new toy

Success!

test(function () {
  resetLog();
  var o = {x: {y: function () {}}};
  var m = makeMembrane(o);
  m.setPolicy(o.x.y, {caller: logCall});
  m.view.x.y();
  return wasLogged();
});

==> true
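A guess at the shape of that test's machinery, written against the modern Proxy API rather than whatever makeMembrane actually does: call advice is keyed by the real function and fires only when the function is invoked through the view. makeCallAdviceSketch is a hypothetical name, and only the caller hook from the test is modeled.

```javascript
// A view that runs `caller` advice before forwarding calls to the model.
function makeCallAdviceSketch(target) {
  const callers = new WeakMap();  // real function -> advice(fn, args)
  const wrappers = new WeakMap(); // real object or function -> its proxy

  function wrap(value) {
    const isObject = value !== null &&
      (typeof value === 'object' || typeof value === 'function');
    if (!isObject) return value;
    if (!wrappers.has(value)) {
      wrappers.set(value, new Proxy(value, {
        get: (obj, prop) => wrap(Reflect.get(obj, prop)),
        apply(fn, thisArg, args) {
          const advice = callers.get(fn);
          if (advice) advice(fn, args); // advice acts at the model/view divide
          return wrap(Reflect.apply(fn, thisArg, args));
        },
      }));
    }
    return wrappers.get(value);
  }

  return {
    view: wrap(target),
    setPolicy(fn, policy) { callers.set(fn, policy.caller); },
  };
}

const log = [];
const o = { x: { y: function () { return 7; } } };
const mem2 = makeCallAdviceSketch(o);
mem2.setPolicy(o.x.y, { caller: () => log.push('called') });
mem2.view.x.y(); // logged: went through the view
o.x.y();         // not logged: the model itself is unadvised
console.log(log);
```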

Problem: securely share an arbitrary object with different parties with varying levels of trust

Phase 1 of the solution: a functional, bidirectional advice system

lazily & deeply copy an object and proxy calls from the view to the model

subtlety: prevent references from the model from leaking to the view and vice versa; consumers of the view have a different notion of equality than those of the model

allow users of the model to set advice for actions performed by consumers of a view

subtlety: advice applies only to the view. membrane owners may separately and securely control the associated view.

subtlety: advice acts exactly at the divide between the model and view, but must still be protected

For those familiar with them, this is similar to a popular idea with object capabilities -- except adding advice is the first step towards usability and verification.

Phase 2: to be described in a much later post -- how to realistically get usability and verification out of this.

Phase 3: profit