Monday, December 14, 2009

When should something be a language feature?

Adrienne and I have been hacking on a JavaScript library for controlling capabilities (see previously posted snippets). We have some semblance of an understanding of its pros and cons, both in the abstractions it provides and in the strategy used to implement them. My suspicion is that it belongs in the language as a feature. That's a big claim.

I described our library to a friend who does a lot of security research. He works more with systems and dynamic analysis techniques than linguistic ones, so he didn't have an intuition for my claim. More precisely, I think he fell into the Blub paradox: he gets along fine without it, and can hack something together if he really needs it, so why bother with the extra language complexity? When something should be a language feature is a tough but stimulating question, and I think some of my reasons for this project are representative:

  • Performance. Bill Buxton, in his "order of magnitude" principle, states that changing a dimension of something by an order of magnitude makes it a new thing. In this case, if a security feature is cheap, you feel fine using it -- imagine how we'd write C code if processes and message passing were as cheap as threads with shared memory! With language support, we can do some dirty tricks to get there. Lightweight threads in functional languages are essentially this.
  • Conciseness. Similarly, if we struggle to write something, we won't. There's a world of difference between writing functional code in C++ and in Haskell or even O'Caml. If you have to write C++ code, it's possible to write most functional stuff with some encoding and library help, but if you want to write functional-style code, you'll be more productive in other languages. This is the order-of-magnitude principle all over again.
  • Expressiveness. At a bigger scale, abstractions like continuations and aspects are fundamentally difficult to encode locally -- shoe-horning them into legacy code is tough.
  • Legibility. Most of our time is spent reading, analyzing, testing, and revisiting code: removing boilerplate and standardizing idioms strengthens large and long-term projects. Data binding -- even if not at the Flapjax level -- is a crowd favorite here. SQL is probably another good example (... until it falls down).
  • Automation. Running through these points is a notion of global or standardized idioms: handling them manually increases the risk of error. My favorite example is manual memory management, which garbage collection automates.
  • Tool support. If a feature is important, we'll probably want program analysis help (verification, testing, etc.) and IDE support for it -- both of which are easier to provide when the feature is part of the language. Package/import handling in Eclipse for Java apps is just the tip of the iceberg here.
  • TCB. The trusted computing base should be as small as possible: if a security-critical property requires a lot of funny code and usage assumptions... that's bad. The whole multi-process browser architecture movement is, in a sense, about this (... though it's not the only way once you think about it like this).
  • Finally... Understanding. The approach of making a calculus based around a feature is in part due to this: what's going on at a basic level and how necessary is it? What happens if we deeply embed it -- what's the value and impact? For a research project, making an idiom a language feature is an enlightening approach, even if it'll be watered down to a library later. I find this somewhat similar to denotational and typed approaches to coding.

The points above are about the value added by doing something at the language layer. I think there's a sniff test with a bit more to it: hovering around most of these reasons is some notion of a global or pervasive property, something a developer is always doing or always needing to maintain.
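
To make the TCB point concrete, here is a minimal sketch of what a library-level capability wrapper might look like. It is not our library's API: `makeRevocable`, the `cap`/`revoke` pair, and the `account` object are all hypothetical names invented for illustration, and it leans on the standard `Proxy` and `Reflect` objects.

```javascript
// Hypothetical sketch of a revocable, read-only capability built as a library.
// None of these names come from the library discussed in this post.
function makeRevocable(target) {
  let live = true;
  const handler = {
    get(obj, prop) {
      if (!live) throw new Error("capability revoked");
      const value = Reflect.get(obj, prop, obj);
      // Methods are bound to the real target so they keep working through the wrapper.
      return typeof value === "function" ? value.bind(obj) : value;
    },
    set() {
      // Writes through the capability are always rejected.
      throw new Error("capability is read-only");
    }
  };
  return {
    cap: new Proxy(target, handler),
    revoke() { live = false; }
  };
}

// Hypothetical object to protect.
const account = {
  balance: 100,
  describe() { return "balance: " + this.balance; }
};
const { cap, revoke } = makeRevocable(account);
```

Even in this tiny version, the usage assumptions pile up: a method pulled out of `cap` before `revoke()` is called stays bound to the real object and keeps working afterwards, and anyone holding a direct reference to `account` bypasses the wrapper entirely. Those are the kinds of holes a library author has to document and a language feature could close.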

Why something should be a language feature is a subtle question -- I still don't really know. Another perspective is to ask when something should not be a language feature. Where is the line?
