1. Try putting this into your URL bar:
javascript:alert(true + true + true)
You get 3: JavaScript silently coerces booleans to numbers. Why does this still happen in modern languages? On a somewhat related note, Taras and Dave Mandelin came by to talk about their static analysis (and code transformation?) framework specialized for the multimillion-line Mozilla codebase. While many of the problems they are trying to solve only exist because Mozilla is based on an unsafe language, two parts of their approach were still neat: reusing the compiler framework to get around the ugliness of the language, which third-party tools will mishandle in practice, and exposing the data through a higher-level language. In their case, they use JavaScript; as an analogy, DTrace provides a SQL-like language for handling probe data. Most queries are variations on data flow analyses with varying sensitivities, so figuring out a DSL would be interesting; for starters, tree matching over ASTs would help a lot. Getting back on topic, what scared me (beyond the insanity of C++) in their talk was one bug class they were looking for: the simple use of booleans as numbers has required multiple security fixes.
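A contrived sketch of that bug class - the function and the "vulnerability" here are made up for illustration, not taken from the Mozilla code:

// A boolean slips into index arithmetic and nothing objects.
function contains(s, c) { return s.indexOf(c) >= 0; } // returns a boolean

var secret = "secret";
var suffix = secret.substring(contains(secret, "x")); // false coerces to 0
// suffix === "secret": the "not found" case silently returns everything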
2. Try putting this into your URL bar:
javascript:alert(function(){return 42;})
The alert shows the function's own source code: JavaScript will decompile any function back to a string for you. This, at first, sounds great. At a minimum, you can use it for better code fixing, but also to violate typical interface checks for some simple hot fixes. Source-to-source translation for secure code has potential too: before executing a function, you can check whether its source has the correct form.
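A minimal sketch of such a check; the rejection policy here (no eval in the source) is invented for illustration:

// Decompile a function and vet its source before running it.
function runIfVetted(f) {
  var src = f.toString();
  if (/\beval\b/.test(src)) throw new Error("rejected: calls eval");
  return f();
}

runIfVetted(function () { return 42; }); // ok: 42
// runIfVetted(function () { return eval("6 * 7"); }); // throws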
There's a price for this rarely needed and effectively non-critical feature: I can't guarantee that any of the tools I'm building to automatically make other people's programs better actually produce a program that does the right thing. For example, as a first step in a wacky project, I'm exploring all possible interesting states of a JavaScript program. Instead of randomly picking new paths through the program every time I rerun it, I'm working on something(s) smarter. This requires me to compile the program down into a more analyzable form for ease of implementation, and to add logging code so my analysis can get the information it needs. However, even adding simple log statements can change the behavior of the final program beyond merely generating a log. If the original program converts a function to a string, inspects the result, and then acts upon it, my added lines of code may have shifted around the information the original needed. It *might* be possible to detect that some code is trying to do the introspection and interject the correct function text, but, realistically, that ain't happening.
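The failure mode in miniature, with log() standing in for whatever my instrumentation actually emits:

function log(msg) { /* record for the analysis; meant to be invisible */ }

function original() { return 42; }
function instrumented() { log("enter"); return 42; }

// Both return 42, so they are behaviorally identical. But any program
// that inspects its own source via toString() now sees different text:
alert(original.toString() === instrumented.toString()); // false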
While dynamic languages give a lot of freedom to developers, they also restrict them: it is hard to ensure an invariant, so we must be conservative with our funkier code. This is not just an expressivity issue: performance suffers significantly. Simply removing eval and introspection might be enough to enable significant automated code redistribution (see the sketch below). Rich Internet applications can load faster using automated analysis and repacking tools than when we try to hand-tune script slicing and dynamic load ordering. I found a couple of very nice Cartesian product type inference projects for Python (promising significant speedups) last semester, but, to my consternation, the authors concluded that certain dynamic code loading properties of the language destroyed any potential for them.
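A short sketch of why eval alone forces that conservatism; the function name and string-building are contrived:

function rarelyUsed() { return "defer me?"; }

var which = "rarely" + "Used"; // could just as easily arrive over the network
eval(which + "()"); // no tool can prove rarelyUsed is dead code,
                    // or safely rename it, or defer loading it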
Agree with the point about instrumentation affecting the program under study. I am seeing something similar when running Valgrind + instrumentation for symbolic execution on Flash Player 9 right now...my tool finds failures which reproduce when Flash Player is run under Valgrind but not otherwise. (This is the first program for which I've found such behavior.)
Tracking down what exactly happened is also a bit of a pain for me, since applying gdb to a program under Valgrind is cumbersome. Not impossible, thankfully, but annoying. Does Firebug work well on your instrumented JavaScript?
Not sure how to get out of this problem. The first thing that comes to mind is to try methods which are less invasive than static or dynamic rewriting, e.g. full system simulation. Those have their own downsides, though, and don't entirely solve the problem (e.g. see Skype's use of timing to compute jump targets!).
Flash Player is an interesting target - are you searching for interesting traits of the player (e.g., synthesizing malicious programs) or of the programs it runs (e.g., finding null pointers)? I'm focusing more on the latter nowadays.
That makes my issue simply one of correctness of transformations.
I do think transformations are more desirable, however. The higher level the instrumentation, the more likely I can pull out solvable constraints (Prof. Dawn Song seems to perpetually struggle with the binary approach of BitBlaze, even for relatively small programs). For example, hashtables are pervasive in ECMAScript programs; I wouldn't want to deal with solving for JMP statements after hash functions (sketched below). The transformation approach has its own inevitable price: the rewritten program is more likely to contain foreign code.
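A sketch of the contrast; the machine-level side is only approximated here in JavaScript:

// At source level, a lookup is one symbolic operation over (table, key):
var table = { x: 1 };
var v = table["x"]; // constraint: v = select(table, key)

// At machine level the same line expands into hash arithmetic plus a
// probe loop, roughly as below, so the solver faces hash(key), not key:
function hashLookup(buckets, key) {
  var h = 0;
  for (var i = 0; i < key.length; i++) h = (h * 31 + key.charCodeAt(i)) | 0;
  var bucket = buckets[h & (buckets.length - 1)] || [];
  for (var j = 0; j < bucket.length; j++)
    if (bucket[j].k === key) return bucket[j].v;
}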
The decision of which to use - or a hybrid - seems very dependent upon the problem. For example, one goal of my system is to be implementation agnostic, so low level instrumentation makes little sense.
My focus is on finding null pointers, integer overflow errors, memory safety violations/buffer overflows, and things like that. I do a little tiny bit of "type inference" on the machine code to look for signed/unsigned conversion errors, since at the machine code level I don't have access to the types. (Of course with debugging information I can get that, especially since Valgrind has a nice library for reading DWARF symbols...but I don't need it.)
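You can see the hazard even through JavaScript's two 32-bit views of the same bits; this is only an analogy to the machine-code situation, not the analysis itself:

var bits = 0xFFFFFFFF;
alert(bits | 0);   // -1, the signed 32-bit reading
alert(bits >>> 0); // 4294967295, the unsigned reading
// a value produced under one reading but consumed under the other is
// exactly the conversion error being hunted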
While it would be interesting to synthesize malicious Flash bytecodes, doing that from inspecting binary traces is almost certainly not the right approach.
On the other hand, working with binary traces _does_ mean that you can get something out via symbolic execution if you can simply run the (dynamically instrumented) program. As you point out, some programming styles make it likely you will get less out rather than more. Computed jumps in switch statements are another area that I and David W. have been concerned about.
Still, there are benefits to a low level approach. I don't care, for example, that mplayer is written in C and Flash Player is written in C++. I also don't need to deal with the issue of code compiling, say, under one version of GCC but not under a newer version that supports better instrumentation. So from my point of view, being lower level by looking at machine code lets me be *more* "implementation agnostic" rather than less. I see your point, though, when trying to symbolically execute JavaScript and not wanting to deal with different browsers' interpretations of what the language constructs mean.
With respect to solvability of constraints, for what I do I haven't found that to be a bottleneck. Sure, some constraints are hard to solve, but if you're trying to find bugs and increase coverage, you can leave them alone and move on to the next ones! What we found with SAGE and I'm finding with catchconv is that there are _plenty_ of input-dependent constraints on a typical path which can be solved in less than 5 seconds. Our NDSS paper and the "Active Property Checking" TR have some numbers if you're interested.
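A sketch of that policy; solve(), negate(), and the budget are hypothetical stand-ins, not the actual SAGE or Catchconv interfaces:

// Hypothetical solver interface, for illustration only.
function negate(c) { return { not: c }; }
function solve(c, budgetSeconds) { return "timeout"; /* or "unsat", or a model */ }

// Skip hard constraints; harvest the easy ones as fresh test inputs.
function exploreBranches(pathConstraints) {
  var newInputs = [];
  for (var i = 0; i < pathConstraints.length; i++) {
    var result = solve(negate(pathConstraints[i]), 5);
    if (result === "timeout") continue; // hard constraint: leave it, move on
    if (result !== "unsat") newInputs.push(result); // a new input to try
  }
  return newInputs;
}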
I also just found a constraint which is hard for one of the solvers I'm using but not for others.
We should meet up some time - I wanted a binary analysis for a certain part of our stuff (both as proof-of-concept and to get the job done), but the effort required seemed unwarranted. However, considering you are doing a very related version anyway, that might lead to something fruitful :) It looks like you're more interested in verifying ECMAScript interpreters than programs, but I find both neat and think the tasks can dovetail :) I'm not so sure inspecting binary traces won't let you synthesize malicious inputs, though without some sort of CFG inference (a la Dawn's work), doing partial order reduction would indeed make it infeasible. If we get it to work with ours, that might make an interesting side project.
I wasn't sure which NDSS paper you meant, so I just scanned the Catchconv TR. That it only caught the small example surprised me (I bet that result is very dated...), but, more generally, for unsigned/signed mismatches I would have thought abstract interpretation of source would be a good fit (maybe ESP?). For something like verifying Firefox, the dev team tries to use standard conversion functions, so finding ambiguous points and labeling them would be reasonable (and in sync with what they're doing with Dehydra for instrumenting new ref counting techniques). That might even be a good side project to learn how to use Dehydra!
Ah, you probably meant the paper with Godefroid - I've been meaning to read that, actually.
Yes, the Catchconv TR is highly dated. In particular, we now do a much better job of pruning formulas before sending them to the solver. You'll want to take a look at the SAGE papers:
ftp://ftp.research.microsoft.com/pub/tr/TR-2007-58.pdf
http://research.microsoft.com/users/pg/public_psfiles/active-checking.pdf
Would be happy to meet sometime. The Catchconv code is on sourceforge, too, so feel free to grab it and do what you want. You may also want to talk with Gautam Altekar, since he's been interested in symbolic execution stuff too...