The Good Soldier LMeyerov: January 2011

Thought I'd post about two interesting things I've been looking at recently: running task parallel code using SIMD instructions and artificial human languages.

SIMTask Parallelism
First, while trying to make CSS layout engines ridiculously fast through hardware-oriented optimizations, I had the basic question: well, if it has multicore-friendly parallelism, why not SIMD-friendly parallelism? I wasn't sure how practical the question was, but as an academic experiment, it's a very interesting one about the nature of parallelism in 'real' hard-to-parallelize software (as opposed to, say, rendering and dense matrices). Looking at our task parallel formulation of CSS layout, we can say it's a combination of two patterns: a task-parallel decomposition roughly following the HTML tree, and a visitor pattern for dispatching different tasks based on the type of each tree node (paragraph, link, etc.). How do we use SIMD instructions for that?

Often, for SIMD instructions, we want something like basic reductions (e.g., scan operations): taking the sum of a list, max of a list. We don't see that a lot in webpages: you'll sum up the children in a node, but then do a lot of computation within the node (margin, padding, border, etc.). Even better for SIMD instructions are maps: add 10 to each width, divide every height by 3, etc. That sounds more close to what we want... except the visitor pattern thing: the data being computed on with the same instructions needs to be adjacent, but, unfortunately, if we look at the tree, different functions are being run on adjacent nodes!

Enter SIMTask parallelism. We can do some data mining on the tree to rearrange it in memory so that the same type of node is next to others of the same type. For example, we can collapse the webpage into a prefix tree and use that to have a different layout in memory: now the same types of nodes are next to each other physically. Because of the task parallelism (with some minor compiler trickery), we can actually run the code in this same rearranged order! For example, in a shopping cart, where each shopping cart item is a complicated subtree, but the subtrees are the same across items (just changing what picture, what text, etc.), all of the link nodes will be next to each other in memory, as will text, as will paragraph formatting, etc. SIMD instructions can be then used!

Anyways, we just submitted a fun vision paper about this. If you take a step back, what we're really saying is that task parallel code often isn't just good for multicores: with smarter frameworks and compilers, we can hope for exploiting some regularity in the task structure to get better memory effects (e.g., from flattening) and even potentially scale down the parallelism to use SIMD evaluation (including better branch prediction)! This is a big departure from current lightweight task parallel frameworks like Cilk/TBB, that, outside of some stack representation trickery, are really about lightweight runtimes and not heavy yet powerful code transformations.

Finally, if you're familiar with modern GPU architecture, this may sound familiar. Modern GPUs use something called a SIMThread model, which is somewhere between vector and threaded ones. Threads get bundled, and for the convergent threads in a bundle (i.e., branching the same way), you get parallel/vector execution. SIMThread = Same Instruction, Multiple Threads. SIMTasks are like that, but more software managed: SIMTasks are to SIMThreads as threads are to SIMThreads. Likewise, vectors < SIMThreads < threads, and reductions/scans < SIMTasks < tasks. Doing SIMTasks introduces a pretty interesting world of optimization; I did some neat tricks with memory layouts and even found opportunities for memoization across SIMTasks that you wouldn't get in normal tasks (in CSS, at least). If you haven't picked up on it now, SIMTask = Same Instruction, Multiple Tasks.

Should be able to put the paper out in a couple months either way. I'd like to report on the full results of the CSS side of things, but we didn't really fit that into this writeup and am not sure when we can (a bit more work to do, can use more automation, and deserves its own writeup -- maybe if we fit it into the Berkeley browser we'll be able to). Story of my life :)

ConLangs: Constructed Languages
A little more fun, I was browsing a bookstore and found a book about the rise of inventing languages: In the Land of Invented Languages. However, with maybe one exception, these weren't about languages for instructing computers: these are about languages for communicating between humans. We all know Esparanto and Klingon, and maybe even Hildegard's and Tolkein's (Tolkein made the elves' language, including their derivation, before the books!), but there have been symbol languages adopted for the deaf, and, even earlier, language invention was almost a gentleman's pursuit.

One was particularly fun: Laadan, a language from a woman's perspective. It stole a bit from Navajo, but apparently also has some novelties predicting some modern security and HCI research: for the grammar, terms are affixed both with emotion and source of authority!

I was really interested in the book as part of my "socio-PLT" kick (sociological approach to programming language theory): how is adopting Klingon or Logjam different from adopting brainfuck or Haskell?

The Good Soldier LMeyerov

Thursday, January 27, 2011

SIMTask parallelism and ConLangs

Translate