Thursday, June 4, 2009

Assembly Language of the Web

What should be the assembly language of the web? JavaScript has risen as the pragmatic choice, with the Flash VM trailing far behind. Some elements of Google think x86 is the way to go.

With Flapjax, JS was a fine choice: we were interested in rewiring existing AJAX-style apps and could do it. Sometimes, with both Flapjax and the membrane work, it wasn't enough (weak references, reflection / interpositioning woes, no good separability). Other notions, like parallelism and DB transactions, are just missing entirely (e.g., workers were not enough when I did some Firefox extension work).

Given a void, what's the way forward? CLR? DLR? JS? x86? VMs? Folks like Erik Meijer and Gilad Bracha seem comfortable with JS as the new assembly, but I'm not. At best, it would need an overhaul that it isn't getting, and, at worst, we're slowing the web down by maybe 5-10 years.

7 comments:

Gilad Bracha said...

Hi,

I agree that JS has major holes - weak references, dynamic access control, does-not-understand, secure access to the stack all come to mind.

I am sure they will gradually be addressed - probably in a very disappointing way, but enough to work around and build on.

We already can see a persistence model - something that was missing 5 years ago when I started thinking about this. It takes a LONG time.

Nevertheless, I don't see anything else that is so strongly positioned.

lmeyerov said...

Not sure what you meant by some of those:

By dynamic access control, do you mean anything beyond an ocap secure language? [I'm @ MSR working on browser & interp support for this right now, so am curious :)] I think not being able to specify a new principal has been a huge weakness in the web model -- requiring a new URL under the same-origin policy and then value-passing between different-origin frames is too unnatural to be used by typical programmers.

What does does-not-understand mean -- Lua-style undefined method dispatch, or perhaps the insanity of undefined variable introduction and coercion?

Finally, for secure access to the stack -- tying AC to the stack (Java style), or additional, but safe, primitives for manipulating the stack?

The situation sucks, but I've been watching the effort for a few years now, and it seems rather amenable to simple but principled suggestions (and implementing it in 1-2 of the browsers is enough to get major momentum, which isn't actually too hard).

Gilad Bracha said...

Hi Leo,

To answer your questions:

Access control. I happen to think that being able to restrict access to members of objects is very important. Yes, you can hide stuff in closures, but it is too inconvenient.
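
As a minimal sketch of the closure workaround mentioned here (the `makeCounter` name and its methods are hypothetical, and the syntax is modern JavaScript): state is hidden by keeping it in a local variable rather than in a property. The inconvenience is that every method touching the hidden state has to be created inside the closure, so nothing can be shared through a prototype.

```javascript
// Hypothetical example: a counter whose state lives only in a closure.
function makeCounter() {
  let count = 0; // reachable only through the two functions below
  return {
    increment() { count += 1; return count; },
    current() { return count; }
  };
}

const counter = makeCounter();
counter.increment();
counter.increment();
console.log(counter.current()); // 2
console.log(counter.count);     // undefined: the state is not a property
```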

doesNotUnderstand - I should check on Lua's definition, but probably yes. See
http://gbracha.blogspot.com/2009/04/need-for-more-lack-of-understanding.html

Secure access to the stack: No, I most certainly do not want Java-style access control. I do want to be able to reflectively gain access to the stack, via capabilities. I believe it is a natural primitive that allows you to build debuggers, add continuations (delimited or not), etc.

lmeyerov said...

For #1, something like accessors (setters and getters -- now in most versions of JavaScript!) let you do what you want, I believe, for object methods and fields. We implemented a membrane system for the DOM using these that isolates objects between principals and then lets a principal specify programmatic policies (such as for making ACLs) for how an object is shared with a particular principal.
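
A rough illustration of the accessor idea, not the membrane system described above: the `guardField` helper and the `canRead` / `canWrite` policy shape are assumptions made up for this sketch. A getter/setter pair defined with `Object.defineProperty` consults a policy object on every access to the underlying field.

```javascript
// Hypothetical helper: expose target[field] on a wrapper object,
// consulting `policy` on every read and write.
function guardField(target, field, policy) {
  const wrapper = {};
  Object.defineProperty(wrapper, field, {
    get() {
      if (!policy.canRead(field)) throw new Error("read denied: " + field);
      return target[field];
    },
    set(value) {
      if (!policy.canWrite(field)) throw new Error("write denied: " + field);
      target[field] = value;
    },
    enumerable: true
  });
  return wrapper;
}

const secret = { balance: 100 };
const readOnlyPolicy = { canRead: () => true, canWrite: () => false };
const view = guardField(secret, "balance", readOnlyPolicy);

console.log(view.balance); // 100
let denied = false;
try { view.balance = 0; } catch (e) { denied = true; }
console.log(denied);         // true
console.log(secret.balance); // 100, the write never reached the target
```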

For the second, I've been tempted to agree. To integrate it into the prototype system, the notFound handler should be able to propagate the message (if desired).
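
One way to picture a propagating handler, sketched with the `Proxy` object that later versions of JavaScript added (it was not available when this thread was written). The `withNotFound` wrapper and the `notFound(name, args)` convention are assumptions for illustration only.

```javascript
// Hypothetical wrapper: calls to missing methods are routed to
// target.notFound(name, args), found through the prototype chain.
function withNotFound(target) {
  return new Proxy(target, {
    get(t, name, receiver) {
      if (typeof name === "symbol" || name in t) {
        return Reflect.get(t, name, receiver);
      }
      return (...args) => t.notFound(name, args);
    }
  });
}

const base = {
  notFound(name, args) { return "base handled " + name; }
};

const child = Object.create(base);
child.notFound = function (name, args) {
  if (name.startsWith("get")) return "child handled " + name;
  // Propagate the message to the handler further up the prototype chain.
  return Object.getPrototypeOf(child).notFound.call(this, name, args);
};

const wrapped = withNotFound(child);
console.log(wrapped.getName()); // "child handled getName"
console.log(wrapped.save());    // "base handled save"
```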

For the third, I don't think reifying the stack is a good idea. For example, TCO implementations won't happen. Instead, something like an aspect system would let you pull out function call sequences or contextually restrict calls. I've actually been working on extending browsers and JavaScript for just that. Interestingly, in terms of the first notion, accessors for object fields and methods, our approach completes the picture: JavaScript really has two types, objects and functions, so advice is the accessor equivalent for functions (though typically with a cleaner pointcut language than using arbitrary JavaScript to get a bunch of references).
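
To make "advice is the accessor equivalent for functions" concrete, here is a minimal sketch in plain JavaScript. The `advise` helper is hypothetical and is not the browser extension mentioned above; it wraps one function reference at a time, which is exactly the manual step a pointcut language would replace.

```javascript
// Hypothetical helper: wrap `fn` so that `before` and `after` advice
// run around every call.
function advise(fn, { before, after }) {
  return function (...args) {
    if (before) before(args);
    const result = fn.apply(this, args);
    return after ? after(result, args) : result;
  };
}

const log = [];
const add = (a, b) => a + b;
const tracedAdd = advise(add, {
  before: (args) => { log.push("call " + args.join(",")); },
  after: (result) => { log.push("return " + result); return result; }
});

const sum = tracedAdd(2, 3);
console.log(sum); // 5
console.log(log); // holds "call 2,3" then "return 5"
```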

Gilad Bracha said...

Quick response regarding tail call optimization. I don't think that reifying stack access and TCO are necessarily contradictory.

It may restrict when TCO is applied - but unrestricted TCO is a problem for development environments anyway, so that doesn't bother me.

Conversely, what the "calling frame" is may not be what you expect if TCO is applied, and in some cases that's OK (this is more like continuations).

In any case, it won't surprise me if people disagree. I'm used to it.

lmeyerov said...

My point is that I view the stack as an almost arbitrary implementation artifact. Something like aspects lets you reason about the semantic objects of interest -- call chains. Perhaps a better example would be securing a sequence of asynchronous calls (e.g., ".innerHTML" is only called between jQuery calls, even if the event loop is used to split them up): the stack would have been destroyed between the async calls, so aspects would get the context while stack inspection wouldn't.

There might be a pragmatic argument... but I'm still at the point where I want to know what the feature should do in this space before "how" and "how efficiently" :)
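
The async case can be sketched with a toy model. Everything here is an assumption for illustration: a hand-rolled `queue` stands in for the browser's event loop, and `defer`, `withContext`, and `setInnerHTML` are hypothetical names. The logical call-chain label is captured when a callback is scheduled and restored when it runs, so the check still works after the original stack is gone.

```javascript
// Hypothetical model: a tiny event queue standing in for the event loop.
const queue = [];
let currentContext = null; // logical call-chain label, not the machine stack

function withContext(label, fn) {
  const saved = currentContext;
  currentContext = label;
  try { return fn(); } finally { currentContext = saved; }
}

// Schedule `fn`, capturing the logical context in effect right now.
function defer(fn) {
  const captured = currentContext;
  queue.push(() => withContext(captured, fn));
}

function runEventLoop() {
  while (queue.length > 0) queue.shift()();
}

// Guarded operation: allowed only in a chain that started in "library".
const writes = [];
function setInnerHTML(html) {
  if (currentContext !== "library") throw new Error("innerHTML denied");
  writes.push(html);
}

withContext("library", () => defer(() => setInnerHTML("<b>ok</b>")));
let blocked = false;
defer(() => { try { setInnerHTML("<i>no</i>"); } catch (e) { blocked = true; } });
runEventLoop();

console.log(writes);  // holds only "<b>ok</b>"
console.log(blocked); // true
```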

Gilad Bracha said...

I agree the stack is an implementation artifact. We can imagine that the call chain is heap allocated. TCO is just GC, if you define things right. This is what the Newspeak spec says (to take a random example :-)).

Now, I'm very concerned that one doesn't require heap allocated activations. You can imagine compacting the stack during GC to eliminate frames that should not exist under TCO. That way, you can run tail recursive programs in development but still get good debugging, for example.

Returning to JS and what it lacks: a capability that allows you to get your calling activation is a good primitive mechanism. It allows for both security and the power one needs to implement debuggers, pickle threads, or whatever you might want to do with continuations (delimited or not).