The Good Soldier LMeyerov: April 2011

Monday, April 25, 2011

New Berkeley Prelim Syllabus

I volunteered to help rethink the language design portion of the Berkeley prelim syllabus this week. Increasing the challenge is that we're cutting down from 60+ articles from before to only 36. We're shooting for about 6 core technical papers, and hoping there will be overlap with other areas (e.g., semantics, type theory, verification, optimization).

For my first pass, I've been thinking:

macros
continuations: on implementing prolog in functional programming
meta object protocol + messages: mirrors
type-directed programming: more types for nested data parallel programming

Any ideas for particulars or topic areas? For macros, I had been considering Shriram's swine before perl or Adapting Object-Oriented Frameworks to FRP. There are also some lightweight but useful papers, such as Hoare's Hints on Programming-Language Design. I couldn't think of anything sufficiently motivated for laziness (e.g., stream processing is pretty weak and equational reasoning is iffy).

Fun exercise :)

I'll try to post soon about some of the fun PL design / analysis / optimization work I've been doing. We just finished a camera ready for a workshop but I won't be writing up any of my bigger papers (SIMD tree layouts, multicore scheduling, CSS semantics / layout engine synthesis, and adoption-oriented languages) until later this year. Need to kickstart the writing process :)

Saturday, April 2, 2011

Topological Sort in Datalog

As a neat trick, I've been spending the weekend rewriting our hundrends (thousands?) of lines of attribute grammar visit scheduling from Java to maybe 100 lines of Datalog. This isn't so much of a surprise: there's been a bunch of work showing how to declaratively describe program analyses with datalog (and optimally optimize it with BDDs, though those results are, unfortunately, more contentious).

Anyways, the core of an attribute grammar scheduler is a topological sort of dependencies. I found this difficult to write. Originally, I tried using a successor relation, where each step was the set of edges (or nodes) that are reachable from the previous. No go, and searching on the web, I didn't see anything else. So, here's another tact:

Given the data:

edge('aa', 'b').

edge('a', 'b').
edge('b', 'c').
edge('c', 'd').

edge('d', 'y').
edge('d', 'z').

edge('a', 'm').
edge('m', 'n').
edge('n', 'o').

edge('o', 'y').
edge('o', 'z').

node(?n) :- edge(?n, ?m).
node(?n) :- edge(?m, ?n).

We calculate the flows-to relation, and then, for each edge, propagate 1 + the label of the previous edge. Finally, we take the maximal incoming label for each node, and voila!

flows(?n, ?m) :- edge(?n, ?m).
flows(?n, ?m) :- flows(?n, ?x), flows(?x, ?m).
whenAll(?n, 0) :- node(?n), !edge(?m, ?n).
whenAll(?n, ?c) :- whenAll(?m, ?d), flows(?m, ?n), ?d + 1 = ?c.
when(?n, ?c) :- whenAll(?n, ?c), ?c + 1 = ?d, !whenAll(?n, ?d).

Testing the result:

?- when(?n, ?c).
Query: ?- when(?n, ?c). ==>> 10 rows in 0ms
Variables: ?n, ?c
('aa', 0)
('a', 0)
('b', 1)
('m', 1)
('c', 2)
('n', 2)
('d', 3)
('o', 3)
('y', 4)
('z', 4)