There are essentially three purposes for language semantics. First, as a smell test to guide language design: if you can't define a feature, it's probably bad. Second, to help other implementers of our language and programmers using the language -- unfortunately, I'd claim we generally fail on this point. Third, to help structure proofs about our language.
As part of the preparation for a paper submission, I'm finishing up my formalization of a subset of CSS 2.1 (blocks, inlines, inline-blocks, and floats) from last year. My first two, direct formalization approaches failed the smell test so Ras and I created a more orthogonal kernel language. It's small, and as the CSS spec is a scattered hodge-podge of prose and visual examples riddled with ambiguities, we phrase it as a total and deterministic attribute grammar that is easy to evaluate in parallel. Finally, to prove that it can be implemented efficiently (e.g., linear in the number of elements on a page, meaning no reflows), the grammar without floats leads to a syntactic proof, and the version with floats then only has to explain away some edge cases (using a single-assignment invariant, which can probably also be made syntactic).
All of this should have happened 10 years ago. However, the academic language design community, as per the norm, seems to have been late to the party. Instead, we have huge, slow interpreters that don't give the same answers and a generation of abused and confused artists and designers.
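To make the attribute-grammar framing above a bit more concrete, here is a minimal sketch. It is not the kernel language from the formalization: the `Box` type, its fields, and the `layout` function are hypothetical, and the only layout rule modeled is vertical stacking of blocks (no inlines, floats, or shrink-to-fit). It only illustrates the single-assignment idea: width flows down from the parent, height is synthesized from the children, and every attribute is written exactly once in one traversal.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, drastically simplified box: vertically stacked blocks only.
struct Box {
    Box(int w, int h) : fixed_width(w), fixed_height(h), width(0), height(0), y(0) {}

    int fixed_width;   // input: explicit width, or -1 for "auto"
    int fixed_height;  // input: explicit height, or -1 for "auto"
    std::vector<std::unique_ptr<Box>> children;

    // Computed attributes, each assigned exactly once per layout.
    int width;   // inherited: depends only on the parent's width
    int height;  // synthesized: depends only on the children's heights
    int y;       // offset from the top of the parent's content area
};

// Assigns width on the way down and height on the way up.
// Returns the number of boxes visited, to show that each box is touched once.
int layout(Box& box, int available_width) {
    box.width = box.fixed_width >= 0 ? box.fixed_width : available_width;

    int visited = 1;
    int content_height = 0;
    for (auto& child : box.children) {
        child->y = content_height;             // stack below earlier siblings
        visited += layout(*child, box.width);  // child width comes from this box
        content_height += child->height;       // child height is final by now
    }

    box.height = box.fixed_height >= 0 ? box.fixed_height : content_height;
    return visited;
}
```

Since no attribute is ever reassigned, there is nothing to reflow, and the work is linear in the number of boxes. In this toy rule set a child's width depends only on its parent, which is the kind of dependency structure that makes a parallel evaluation plausible.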
Monday, June 29, 2009
Friday, June 26, 2009
An evil test case
Opera, WebKit, and Firefox are supposed to conform to CSS 2.1 standards, at least, right?
A fun test:
<div style="width: 30px; height: 30px; background-color: blue; padding: 1px">
<div style="display: block; background-color: red; padding: 1px">
<div style="opacity: .1; float: left; width: 100px; height: 40px; background-color: green">a</div>
<div style="opacity: .2; float: left; width: 100px; height: 40px; background-color: green">b</div>
<div style="opacity: .3; float: left; width: 100px; height: 40px; background-color: green">c</div>
<div style="width: 50px; height: 10px; background-color: green">d</div>
</div>
</div>
Do three versions of it: set the red box to be display block, inline-block, and inline. The subtlety in why this breaks is that, while these browsers might conform to the CSS specification for these cases (I'm still not sure about this), the test exercises ambiguously defined parts of the spec.
For the few folks actually implementing this stuff, it exercises ambiguity in the definition of preferred widths, preferred minimum widths, and shrink-to-fit in the presence of floats.
Formalizing one fully-defined interpretation of the spec is... interesting. Thought I was done, but realized I wasn't =/
Friday, June 12, 2009
First weekend in Seattle
Still looking for a place, but Capitol Hill successfully captured my heart:
- Lunch: jamon serrano crepe @ joe bar cafe
- Snack: picked up a copy of Pride and Prejudice and Zombies and headed off to another cafe
- Discovered Dilletante, a chocolate cafe and chocolate martini bar -- opens at 4pm, so on my TODO list
I also somehow came across two free passes to the Seattle International Film Festival that is ending this weekend. Viable candidates (will see one each on both remaining days; haven't found anyone else who is interested yet):
- Saturday:
  - In Your Absence
  - Breathless (it was great!)
  - Fifty Dead Men
  - Flame & Citron
  - Forever Enthralled
- Sunday:
  - Marcello Marcello
  - The Shaft (done well)
  - Involuntary
  - OSS 117: Lost in Rio (+ closing gala)
  - The Overbrook Brothers
  - A Pain in the Ass
300+ movies over one month... that's an intense festival. Despite the lack of an apartment and a slowdown on my research, summer is going well (and I apparently had a subletter for my place in Berkeley without realizing it!)
Thursday, June 11, 2009
RazorFish
I think the most significant demonstration was somewhere between 5:00 and 5:30. Would have loved to have hacked on this in high school :)
Sunday, June 7, 2009
Security as a Disease
Lovely post by Rob Meijer on the capability list about how to write a useful Wikipedia article (in particular, for cleaning up the entry on ambient authority):
A good way for me personally to think about the audience for infosec related subjects, is to think about myself on medical related subjects.
...
In this case, the main things you would want to know would be:
* How do I know if I have ambientitus?
If I know I don't have ambientitus, I'm out, off to find other possible sources of my symptom. If it is likely that I have ambientitus, I would want to know:
* How bad is it? Is it fatal?
...
That totally made my morning. Worth reading.
Friday, June 5, 2009
updateable secure views
Was looking at what the bidirectional programming folks at UPenn were up to, and got excited: the paper we had rejected last year about secure browser programming through views was, in another form, accepted for these guys: Updateable Security Views. Our take on it (a tentative title was "membranes, views, and browsers") was of capabilities and flexibility; this clearly has different principles (static checking, information flow, etc.). The important thing is that the idea is gaining traction!
Thursday, June 4, 2009
Assembly Language of the Web
What should be the assembly language of the web? JavaScript has risen as the pragmatic choice, with the Flash VM trailing far behind. Some elements of Google think x86 is the way to go.
With Flapjax, JS was a fine choice: we were interested in rewiring existing AJAX-style apps and could do it. Sometimes, with both Flapjax and the membrane work, it wasn't enough (weak references, reflection / interpositioning woes, no good separability). Other notions, like parallelism and DB transactions, are just missing entirely (e.g., workers were not enough when I did some Firefox extension work).
Given a void, what's the way forward? CLR? DLR? JS? x86? VMs? Folks like Erik Meijer and Gilad Bracha seem comfortable with JS as the new assembly, but I'm not. At best, it would then need an overhaul that it isn't getting, and, at worst, we're slowing the web down by maybe 5-10 years.
Wednesday, May 27, 2009
Exciting times!
Thought I'd post a quick status update. I will not actually be here this summer!
1. Browser stuff. Over the next month, I'll be bouncing around and hopefully finishing the initial version of my parallel web page layout algorithms. In the fall, I want to make sure it's all stitched together and then might switch into thinking about adaptivity or, even more general, parallel scripting.
2. Webpage model extraction / exploration stuff. After the browser work reaches a good state (PPoPP?), I'll be rewriting and scaling out our blackbox analyzer and will make it directed. If collaboration works out, there'll be some interesting twists (either a new type of analysis or integrating and expanding some earlier whitebox ideas)
3. Summer! Something mysterious at Microsoft Research about browser security. I'm guessing/hoping a principled clean-slate approach or some program analysis.
The first of many flights starts tomorrow.
Monday, May 25, 2009
mixing thread-aware and thread-agnostic code
For almost all of the algorithms I've been playing with for the parallel browser, Cilk-style parallelism matches. My development pattern is to do a sequential version, do a Cilk++ sketch, and then, for final tweaking, convert to TBB (... with a lot of iteration involving hawkish monitoring of KCacheGrind statistics). However, something invariably goes wrong.
This week, it's using task-parallelism with a multi-threaded library. Task parallelism gets you away from the notion of a thread: whenever you have a unit of work, you just spawn it off, and thus may have many tasks for only a few processors. With threads, assuming you're CPU bound, you have as many threads as processors. FreeType2 is written for threaded use: each thread gets its own Library. However, task parallel usage (I'm rendering a bunch of glyphs: turning one character into a pixel can be thought of as a task) doesn't map nicely -- if I were to create a Library per task, I'd have to create thousands of Libraries instead of, say, 8.
The naive solution is to set up a resource pool: a task asks the pool for a Library when it starts, and returns it when it finishes. If there is no Library available, it gets created. If tasks are really small (e.g., individual characters, as opposed to, say, words), there'll be a lot of chatter when trying to get these Libraries (and, even if not, locks waste cycles, which is still a penalty proportional to task size).
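The pool described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the browser: `Library` is a hypothetical stand-in for a FreeType library handle (a real version would create and destroy the handle where the comments indicate), and `LibraryPool` and `Lease` are names invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive resource that must not be shared
// between concurrently running tasks (e.g., a FreeType library handle).
struct Library {
    int id;
};

class LibraryPool {
public:
    // RAII lease: hands the Library back to the pool when the task is done.
    class Lease {
    public:
        Lease(LibraryPool& pool, std::unique_ptr<Library> lib)
            : pool_(&pool), lib_(std::move(lib)) {}
        Lease(Lease&&) = default;
        Lease(const Lease&) = delete;
        Lease& operator=(const Lease&) = delete;
        ~Lease() {
            if (lib_) pool_->release(std::move(lib_));  // skip moved-from leases
        }
        Library* operator->() { return lib_.get(); }

    private:
        LibraryPool* pool_;
        std::unique_ptr<Library> lib_;
    };

    // Called at the start of a task: reuse an idle Library or create one.
    Lease acquire() {
        std::unique_ptr<Library> lib;
        {
            std::lock_guard<std::mutex> lock(mutex_);  // the per-task "chatter"
            if (!idle_.empty()) {
                lib = std::move(idle_.back());
                idle_.pop_back();
            } else {
                lib.reset(new Library{created_++});  // real code: init the handle
            }
        }
        return Lease(*this, std::move(lib));
    }

    // Number of Libraries created so far.
    int created() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return created_;
    }

private:
    void release(std::unique_ptr<Library> lib) {
        std::lock_guard<std::mutex> lock(mutex_);
        idle_.push_back(std::move(lib));
    }

    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<Library>> idle_;
    int created_ = 0;
};
```

A task would call `pool.acquire()` on entry and let the lease go out of scope on exit. Every task pays for two lock acquisitions, which is exactly the overhead that hurts when a task is as small as one glyph.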
TBB, because it is a library-level solution, has actual task objects (Cilk, I believe, just puts some sort of continuation mark on the stack) and therefore faces the same problem all the time. It provides conveniences for reusing task objects (think of it like a manual TCO or trampoline). When reusing a task, a good habit can be to reuse data within it. In this case, when a task completes, it passes off its Library object to the next one that gets/becomes the reified task object.
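The hand-off idea can be illustrated without any TBB API at all. The sketch below is single-threaded and entirely hypothetical (`GlyphLibrary`, `GlyphTask`, `render_text`, and the placeholder rasterizer are invented for this example); it only shows the shape of a task object that re-arms itself for the next glyph instead of being destroyed, so the resource it holds is created once rather than once per glyph.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts constructions of the stand-in resource, purely for illustration.
int g_libraries_created = 0;

// Hypothetical stand-in for an expensive FreeType-like library handle.
struct GlyphLibrary {
    GlyphLibrary() { ++g_libraries_created; }
};

// A reified task: renders one glyph per execute() call and then recycles
// itself as the task for the next glyph, keeping its GlyphLibrary.
class GlyphTask {
public:
    explicit GlyphTask(const std::string& text) : text_(text), next_(0) {}

    // Runs one unit of work. Returns true if the task re-armed itself
    // (more glyphs remain), false once there is nothing left to do.
    bool execute(std::vector<int>& out) {
        if (next_ >= text_.size()) return false;
        out.push_back(render(lib_, text_[next_]));
        ++next_;
        return next_ < text_.size();
    }

private:
    // Placeholder rasterizer: a real one would produce a bitmap via the library.
    static int render(GlyphLibrary&, char c) {
        return static_cast<unsigned char>(c);
    }

    GlyphLibrary lib_;   // travels with the task across recycles
    std::string text_;
    std::size_t next_;
};

// Trampoline: keep re-running the same task object until it stops recycling.
std::vector<int> render_text(const std::string& text) {
    std::vector<int> out;
    GlyphTask task(text);
    while (task.execute(out)) {
    }
    return out;
}
```

In a real scheduler the loop would be the runtime's job and several such tasks would run at once, each carrying its own library, so no lock is needed on the per-glyph path.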
Unfortunately, I don't think the code will work out that well. In reality, there's a hierarchy of resources (Library -> {Font}, Font -> {Glyph}). It'll work, but the impedance mismatch will cause some slowdowns.
Wednesday, May 20, 2009
collaborative security
Was watching a video of Aza Raskin and, around 18:00, I got excited. Can we treat security as a people problem?
I've been mulling this over both in my work on overcoming data silos and in extracting models of applications. In the former, the user might want to add extra security to an app like Google Calendar, say by setting special permissions for a particular event or even encrypting data before Google sees it, and, in the model extraction, I'd like users to pool their models together to collaboratively get bigger ones -- but I don't want stuff like bank account info to leak over. This latter problem occurs slightly differently in some of my work in mashup security: can we trust an extension to translate a webpage, but not, say, leak a bank account number?
Everyone, including Aza, bashed on the UAC: we can't just pepper users with dialog boxes. We really want things like blacklists That Just Work. Aza asks, just as we might trust a smart nephew to buy us a computer, might we trust one to figure out security for us? In the absence of a smart nephew, can we learn the security policy? What do cautious people normally say to a dialog box? Is there a bit of information on a page that users generally mark as privileged?
In three of my projects so far, I've found cases where I didn't think the application writer could a priori determine the appropriate action, yet doubt that the casual web user can either. What would it mean to build a browser or application extension that outsources security?