Saturday, December 22, 2007

Why I Live in the Bay Area

Special emphasis on live.

QUINTESSENCE Five elements. One fantastic journey.

"The show is a spectacular feast for the senses that combines modern dancing, Russian folk music, and clownery that together create a truly unique theatrical experience."

Quintessence features the combined efforts of several renowned performing troupes:
• World-famous clown-mime theater Licedei,
• Acclaimed Russian ambient folk singers Ivan Kupala,
• Pioneering folk-modern troupe Firebird Dance Theater,
• California's home-grown, infamous Siberian surf-rock band Red Elvises



I got a voicemail from Ilya (geek-in-residence @ LucasArts now) about this 15 minutes ago, and, as of 5 minutes ago, I'm a proud ticket holder.

As an alternative list of reasons I like my lifestyle here, my tentative winter break plans:
  1. Take another stab at dependent type theory
  2. Work on my guided transparency + imperative FRP paper
  3. New Year's in NYC
  4. Fix my structural type-feedback JS analysis
  5. ParLab retreat + poster, start thinking about FRP, concurrency, asynchrony, & parallelism
  6. Check out the last day of POPL
  7. Either muck with Haskell or Erlang, attempt a JS concolic tester, or automatically find a particular class of security flaws in Firefox
  8. Start reading papers for prelims (reattempt the pi-calculus, Hoare logic, & abstract interpretation?)


<BEGINRANT>
I'm reviewing case studies from our compiler students on using a simplified variant of FRP for a project - very interesting reading. I rarely see PL papers that involve user studies, so this is a novel experience. Knowing that a <pick-favorite-language> genius can write a fully featured, verifiable, horizontally scaling web app in <small-rational>K lines of code in a language/framework they designed doesn't help me much. But what would? Structuring a user study, getting users, and then getting meaningful results is hard. If the interest is in feedback from 'average joe' programmers rather than expert programming language enthusiasts, then in general I should expect only quantitative, not qualitative, statements from users. On top of that, I must guide them in what to answer, which already implicitly biases their answers. For example, I'm interested in how many errors stem from the distinction between discrete event streams and continuously valued behaviours - yet users may misattribute those errors because they don't know better. One difficult but effective approach I recently saw Dan Grossman take for his type-error-finding paper was to instrument the compiler to record user actions, which let him do a detailed postmortem analysis and arrive at a solid categorization. A time suck, but I believe going this extra mile is part of what separates the scientists from the mathematicians in programming language theory. At least in the conference proceedings and journals I tend to read, the latter far outnumber the former, and I see it as a field of artists with similar taste ("schools of thought"), not actual scientists.
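For non-FRP readers, here's a minimal sketch of the distinction I mean, in toy Haskell - my own illustration, not Flapjax's API and not the students' compiler:

-- A Behavior is defined at every moment in time; an Event stream only
-- carries values at discrete occurrences. Conflating the two is exactly
-- the kind of error I'd like to count.
type Time = Double

newtype Behavior a = Behavior (Time -> a)   -- continuous: total over time
newtype Event    a = Event    [(Time, a)]   -- discrete: time-stamped occurrences

-- Sampling a behavior is always well-defined...
at :: Behavior a -> Time -> a
at (Behavior f) t = f t

-- ...whereas "the event's value right now" may simply not exist.
occursAt :: Event a -> Time -> Maybe a
occursAt (Event occs) t = lookup t occs

-- The usual bridge: hold the latest occurrence to get a step behavior,
-- e.g. at (stepper 0 clicks) 2.5 samples whatever 'clicks' last carried.
stepper :: a -> Event a -> Behavior a
stepper v0 (Event occs) = Behavior $ \t ->
  case [v | (t', v) <- occs, t' <= t] of
    []  -> v0
    vs  -> last vs   -- assumes occurrences are listed in time order

If users' bug reports blur "the latest value of a stream" with "the value of a behavior", that's the category error I want to surface - but only if I can detect it without putting words in their mouths.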

At this point, I feel we need far more of the former than we currently have, despite the freedom the latter enjoy and their tendency to focus on foundations (which I believe is still important). Even in good old program analysis, where one can compare one's tool against others, most papers seem to make a token, biased effort - Lhotak's "Comparing Call Graphs" recently showed me again just how susceptible analyses are to environmental conditions, stressing the pains one ought to take in preparing fair comparisons. The lack of such care in many other papers at supposedly 'top tier' conferences embarrasses me as someone trying to enter the field. Whenever I review a paper and see someone include this sort of result (or neglect it), I'm tempted to brush it aside because no new technique is presented; yet when I step back, this information may well be more useful to others than whatever technique is actually presented, since it helps guide future focus. </ENDRANT>

4 comments:

Michael Greenberg said...

I've felt the same way about evaluation of technical results for a while now. Don't we have logs from Flapjax? Isn't some perky undergrad itching to change computer science by interpreting them?

Michael Kearns gave a talk as part of a (mandatory) seminar series. It was kind of neat, but all of these simulations and tests still felt like, well, playing around. Where's the meaningful result? I guess I'm more (shoddy) mathematician than scientist.

As for abstract interpretation, Cousot & Cousot's 1976 paper is an easy-to-read classic; I implemented it this semester. The conventional reinterpretation as a dataflow problem is just an optimization.

T-4 days to SF -- who's excited?

lmeyerov said...

"logs from..."

I think Greg disabled those (it may have been on the web services side, I don't recall), but either way, I also believe the compiled language is too restrictive (read: pure) to yield surprising results. That was sufficiently clear from the class I TA'd.

"Michael Kearns..."

I couldn't find the cited talk (internal-only presentation?). Anyway, I shouldn't have implied mathematicians are shoddy (?), but, just as math is not a science, a mathematical approach to PLT is not scientific. With extra work, though, we can often do both. We implement systems to learn from them. If you bother to implement something, there's a good chance there are lessons to be distilled, and more so when an implementation is polished and used by others. In the current academic system, however, practical incentives for such approaches seem somewhat lacking. As for a meaningful result... well, that's sort of the point - is your claim that there are none? I've long wondered what makes a good user study for a modern language or feature. I've read a bunch of CS and Cog. Sci. papers about languages and communication, but I haven't seen many objective takes on what makes a 'good' one, nor many equilibria.

HCI folks seem generally ill-suited for showing how our results apply to real systems. Maybe that's a hint: programming languages aren't ideal for actual interfaces, and that community mostly gave up on us (not to insult Brad Myers and all the other innovators in the area). That may be true for end-users, but I still see a software crisis (well, several) around me, and I don't believe that's a sufficient approach.

Thanks for the paper pointer. A couple of people also pointed me towards Cousot & Cousot's paper this semester (it seems to be part of every prelim list). However, some people in the field (don't want to name names...) were not too happy with most modern introductions to abstract interpretation, so if you've encountered good follow-ups... :) A couple of recent papers (probabilistic abstract interpretation, some others) magically appeared in my bag as I left town, so I'm curious.

Enjoy New Year's in SF -- let me know if you want any restaurant recommendations. The city does fireworks in stereo for July 4th, so they may just do it for New Year's too :)

Michael Greenberg said...

The presentation doesn't seem to be up, but his publication list looks interesting. He talked about simulations he ran of distributed computation with human actors. Apparently humans are quite good at solving very hard problems, e.g., graph coloring, but not very good at solving trivial problems, e.g., "consensus" (read: color a graph all one color). It was all very suggestive and interesting, but he wasn't able to say anything quantitative about it. I think the work is in fairly preliminary stages. His work is outside PL, anyway; I'm not sure there's much we can learn from it.

The shoddy was self-deprecating -- there are lots of great mathematicians in computer science, I just don't feel like one of them.

lmeyerov said...

Sounds sort of like a meta approach to Luis von Ahn's work (http://www.cs.cmu.edu/~biglou/) - captchas, mechanical turk (I think), Google image matching, and other social/human computation approaches.

Data-driven approaches are showing relative success in graphics and search - the neural-nets guys are always talking about the types of computation they believe should work in their model, so it sounds like Kearns has a refreshing take on this classification of data-generating capabilities.

SK always joked that we'll never get engineers to write LTL specs, but I think even at that level we can ask what, empirically, is an accessible presentation of temporal logic. We know people often think sequentially, so I don't see temporal logic - or at least its useful fragments - as inherently inaccessible. He's pitching a neat project right now (don't want to spill the beans fully here..) based on automatically extracting specs from natural-language text, and I suspect one of the more interesting results (assuming success) will be a characterization of the logical statements people actually make. I remember Vicky Weismann dabbled in this more manually before, with limited results.
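To make that concrete with a standard textbook example (my own illustration, not SK's project): the requirement "every request is eventually granted" is written G(request -> F granted) in LTL - "always, if a request happens, then eventually a grant follows." The English reads sequentially just fine; my guess is that the nested always/eventually quantification, not the temporal idea itself, is where engineers stumble, and that's exactly the kind of thing a study could measure.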