Thursday, January 15, 2009


Today, I was going to try getting the Cilk++ port of one of our algorithms running, but apparently there are no (publicly?) available lex/yacc grammars for CSS, and the one in the specification is downright wrong. So, I'll try again tomorrow, and in the meantime leave anyone else interested in parsing CSS with some nearly-compliant CSS 2.1 lex and yacc grammars:

(01/15/2009) Initial CSS 2.1 lex grammar and CSS 2.1 yacc grammar (flex/bison, really). The grammar given in the official specification is non-standard, incomplete, and ambiguous; this one is executable lex/yacc, completes the token definitions, and fixes the ambiguity in simple selectors. Caveat: I admit to writing it messily in only a few hours as a learning exercise :) It was tested on a few thousand lines of Slashdot's CSS files. If you care about priorities (e.g., !important) or about identifiers containing "_" and "-" symbols, you should modify how they are handled: I (non-compliantly) strip out priorities and accept illegal identifiers. For convenience, the lexer strips out whitespace, including where it is used as a delimiter to signify descendant relations: good luck making a yacc grammar that doesn't ;-)
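
To make the whitespace caveat concrete, here is a minimal Python sketch (not the flex/bison grammar itself). The token names, the `TOKEN_RE` pattern, and the `tokenize` helper are all hypothetical and cover only a tiny slice of selector syntax. It shows that once whitespace is dropped, the descendant selector `div .note` and the compound selector `div.note` produce the same token stream, so a parser downstream can no longer tell them apart.

```python
import re

# Hypothetical, heavily simplified token set for illustration only;
# this is NOT the token set of the lex grammar described in the post.
TOKEN_RE = re.compile(r"""
    (?P<S>\s+)                        # whitespace (descendant combinator in selectors)
  | (?P<IDENT>[A-Za-z][A-Za-z0-9]*)   # simplified identifier
  | (?P<DOT>\.)                       # class marker
  | (?P<HASH>\#)                      # id marker
""", re.VERBOSE)


def tokenize(selector, keep_whitespace):
    """Split a selector into (kind, text) pairs, optionally dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(selector):
        m = TOKEN_RE.match(selector, pos)
        if m is None:
            raise ValueError(f"unexpected character {selector[pos]!r} at {pos}")
        kind = m.lastgroup
        if kind == "S":
            if keep_whitespace:
                tokens.append(("S", " "))  # normalize any whitespace run to one space
        else:
            tokens.append((kind, m.group()))
        pos = m.end()
    return tokens


# With whitespace stripped, the two selectors are indistinguishable.
assert tokenize("div .note", keep_whitespace=False) == tokenize("div.note", keep_whitespace=False)

# With whitespace kept as a token, the descendant relation survives.
assert tokenize("div .note", keep_whitespace=True) != tokenize("div.note", keep_whitespace=True)
```

Keeping the `S` token preserves the distinction, but the grammar then has to allow optional whitespace in many other places, which is the part that gets painful in yacc.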

Hopefully this saves others some pain.

Perhaps I'll also have time one day to write a tutorial on how to precisely think about and control floats (based on the CSS specification); it would probably be useful to way more people :) Crazy that I had to formalize and restructure the CSS specification in order to halfway understand it.

1 comment:

Noel said...

Crazy, but unsurprising.