Saturday, March 10, 2012

Languages for Bad Programmers

Bad programmers are linguistic victims and good programmers are masochists.

Brett Victor is going viral with an inspiring video on language and tool design (and philosophy). The language community has actually already made some strong strides in the directions he wants. Some of my favorite research is in tangible programming,  live coding, and omniscient debuggers.

Patrick Lam picked up on something else. As part of Brett Victor's philosophy that every code artifact should be directly (physically) manipulatable and understood, Brett believes requiring indirect manipulation and indirect understanding to be language design flaws. In particular, programming should not require mentally simulating an abstract Von Neumann machine that steps through your program. Not everyone agrees: Pat observed that, on the list of signs of being a bad programmer, the very first warning that you are a bad programmer if you cannot "run the program in your head".

Here's the thought. What would a language for bad programmers look like? Looking at the list of hacky "bad" programming practices, can you imagine a future where language and tool support will advance enough that they're the acceptable norm? The "bad" habits actually form an admirable set of goals for the language research community:

  • Warning sign: Cannot simulate the program in their head.
    Goal: Get the computer to do the simulation.
  • Warning sign: Have a poor understanding of the language's programming model.
    Goal: If I make basic grammatical mistakes when I speak, people correct me -- why doesn't the compiler?
  • Warning sign: Chronically poor knowledge of the platform's features. Symptoms: reinvent language and framework code, "Email me teh code, plz" messages on forums, etc.
    Goal: Instead of viewing code copying as a bug, it's a design goal: I suspect *most* code today is duplicated, so I look forward to the future where the emphasis is on searching and synthesizing code rather than learning and writing it.
  • Warning sign: Struggle to comprehend pointers. Symptoms: failure to implement a linked list, allocates arbitrarily big arrays, inability to find pointer arithmetic bugs, etc.
    Goal: raise the level of abstraction so programmers don't have to think about these issues.
  • Warning sign: Have difficulty seeing through recursion.
    Goal: Thinking concretely should be fine, and the tools should walk you through the base cases and generalization.
You can apply similar reasoning to the category of "mediocre programmers" and, my favorite, "pinball programmers". Pinball programmers struggle with edge cases, or simply do not care. They can be a security disaster. Again, however.... so what? These types of problems are exactly what we built computers to solve: rote grunt work.

The shocking conclusion of the bad programmer essay is the signs that you shouldn't be a programmer. If you cannot determine the order of program execution, cannot think abstractly, poorly organize your code as an ever-expanding ball of mud, believe the code and runtime are unreliable, or simply don't care about much of the code quality, you shouldn't be a programmer. Hogwash!  If these traits are required for programming, our ability to write code that keeps growing in ambition by magnitudes seems doomed. I can fairly neatly bucket these warning signs as inevitable programming realities (massive and concurrent code on unreliable hardware) and goals ("search, don't sort"; declare what instead of how).

Maybe the programmers aren't really all that bad. It sounds like today's languages are the stupid ones.


(Updated 1/26/2013 with minor editing.)

Wednesday, March 7, 2012

Loop synthesis: more declarative than functional programming?

As part of the FTL language, I've been playing with the idea of declarative loops. For example, I'd like to write:


for i in range(0, n):
b[i + 1] = g(a[i]);
a[i - 1] = f(b[i - 4]);


First, the synthesizer should figure out that the problem is under-constrained. It should report that the following values are undefined:


a[0..2] = ??
b[0] = ??
a[n] = ??


If, as suggested by the synthesizer, I refined the program to be consistent:


a[0..2] = 0
b[0] = 0
a[n] = 0
for i in range(0, n):
b[i + 1] = g(a[i]);
a[i - 1] = f(b[i - 4]);


I'd like to get:


a[0] = 0
a[1] = 0
a[2] = 0
b[0] = 0
b[1] = g(a[0])
b[2] = g(a[1])
b[3] = g(a[2])
for (int i = 3; i < n - 1; i++) {
a[i] = f(b[i - 3]);
b[i + 1] = g(a[i])
a[n] = 0


Note that the synthesizer performed loop peeling -- it figured out to do loop peeling to emit assignments for b[1], b[2], and b[3]. Furthermore, it reordered the assignments in the loop and performed loop skewing -- it changed the indices of the assignments, such as from a[i - 1] to a[i], to ensure an array value is written before it is read.

Milind and I just worked out most of the theory for this today. You can think of it as a generalization of attribute grammar synthesis to loop code. Basically, if we assume the input code is singly assigned -- array values should have one unique value -- this is a matter of unrolling the intermediate steps of the loop, scheduling their dependency graph, and reconstructing the Von Neumann-friendly loop. The weekend may be busy :)

A fun question is whether a functional program could declaratively handle the above. For example, it could expand the for loop into a dependency graph and then use demand-driven evaluation. However, there are two problems. First, the interpretation of indices would have to change: my code correctly skewed the loop indices to make sure "a[i - 1]" and "b[i - 4]" are never negative. Second, the loop would be unsafe: in my approach, the synthesizer can statically check that the array is fully defined. Either way, traditional functional languages force programmers to explicitly handle these (arguably) low-level loop details. Go synthesis!

Sunday, March 4, 2012

Modeling language experience report: synthesizing CSS's automatic table layout solver


Table rendered using an automatically synthesized layout solver.



Model-driven languages handle complexity in implementing a language. For example, after designing a language feature, you want it to work with the optimizer, parser, IDE, etc. As the language gets more complicated, such as we've seen with the increasing scope of the CSS layout language, the implementation burden gets so high that many seemingly small language extensions and tool improvements require understanding a lot of the eco-system. With modeling languages, we can use synthesizers and code generators to correctly write features, avoid duplication of work, and basically handle more code than you could manually. For example, we don't want to manually teach the Firebug debugger about CSS semantics nor, when parallelizing for multicore or SIMD hardware, do we want to manually look at every edge case in the layout specification.

We picked attribute grammars as the basic specification language for our parallel browser layout language work. However, this begs the question: are attribute grammars expressive enough to actually handle layout languages?** Previously, we wrote (and automatically parallelized) various visualizations, so we know attribute grammar can be used for
future languages. We're now going after the elephant in the room: CSS. As a fun start, Ras and I just put together an experience report about our successful experience with synthesizing an implementation of the 'automatic table layout' specification.

In short, it works. The table above was rendered using an automatically generated layout engine. Furthermore, our synthesizer found a multicore implementation for it. Neat :)

Stay tuned for more and we'll welcome any challenge problems :) Props to Ras, Matt, and Sam for help in revising that report!

**: More specifically, can we do this with statically scheduled attribute grammars? I'm distinguishing against 'demand-driven' and other approaches that essentially reduce to the Turing tar pit via general lazy functional programming.