Wednesday, May 7, 2008

Social Software

Almost all stages of software design, except for the explicit task of writing code, now take account of the social reality of releasing big programs for many users. We have smart editors to ease navigation of thousands to millions of lines of code (note: what's the most anyone's fit into an IDE?), distributed version control systems, and, increasingly, distributed testing and even compilation. Maybe we're still missing something within the code itself?

Taking time away from coding my JS simulator, I saw a link to a scary Mozilla/Firefox bug report: a developer put a virus into the Vietnamese language pack extension. This isn't a novel scenario - we periodically see CDs, those concrete checkpoints of quality backed by printing costs, being pressed with viruses in them. Looking further in the bug report, we see an unsettling reality: virus definitions are updated every 6 hours, and it takes a long time to check software against them.

My gut reaction is to want to ensure virus checkers are incrementalized - but that just perpetuates the old model of development. Fortunately, feature-wise, there are a lot of developers on several big open source projects. Unfortunately, security-wise, that's a whole lot of unknown people. Many projects employ a developer tier system to manage the varying layers of trust: you start out only doing bug reports, then bug fixes which get reviewed, and then you are put in charge of speccing features, creating them, and reviewing code for them, or even simply managing others. However, this is a very fallible process, and susceptible to subversion.

It does allude to a basic principle: trust builds over time.

Now, I'm wondering - can we incorporate this notion of varying levels of trust of developers in a modularized manner in terms of code capabilities? For example, perhaps code from a new developer can only run in a sandbox, and after the developer is trusted, the same contributed code will compile to be outside of it and thus run faster?
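To make the question a bit more concrete, here is a minimal sketch in Python of what tying execution mode to developer trust might look like. Everything in it is hypothetical: the Trust tiers loosely mirror the tier system described above, and Module, execution_mode, and the "sandboxed"/"native" labels are made-up names for illustration, not any real build system's API.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    """Hypothetical developer tiers, lowest to highest."""
    REPORTER = 0   # only files bug reports
    REVIEWED = 1   # submits bug fixes that get reviewed
    OWNER = 2      # specs, builds, and reviews features


@dataclass
class Module:
    name: str
    author_trust: Trust


def execution_mode(module: Module, threshold: Trust = Trust.OWNER) -> str:
    """Pick how a module would be compiled, given its author's current trust."""
    if module.author_trust >= threshold:
        return "native"      # compiled outside the sandbox, so faster
    return "sandboxed"       # same code, but run with restricted capabilities


# The same contributed code changes mode once its author is trusted.
patch = Module("language-pack", Trust.REPORTER)
before = execution_mode(patch)
patch.author_trust = Trust.OWNER
after = execution_mode(patch)
```

The point of the sketch is only that the code itself doesn't change: the decision lives in the build step, so accrued trust can be applied retroactively to earlier contributions.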


Asa Dotzler said...

In this case, the community-built add-ons process is not the same as and not nearly as rigorous as the process for Mozilla products themselves (like Firefox.)

Every line of code that gets checked into Mozilla's code repository goes through several human reviews, line by line, so this wouldn't have been able to easily slip into the Mozilla product. Also the machines that compile the release binaries aren't "user" machines exposed to the horrors of the Internet.

With community-built extensions, the author writes it, uploads it, it's scanned by anti-virus software at upload and then enters a community testing area called the sandbox. Testers download, install and test out the software but don't perform code inspection. If testing goes well, then an editor for the site will make the add-on public.

Leo Meyerovich said...

Why not run such code in a sandbox, perform distributed tracing (which minimizes logging cost), and, after an incubation period, have it run out of the sandbox?

This would help highlight code that should be quickly examined as it is popular, and give more protection with some notion of accrued trust for less popular extensions. The interaction cost would be minimal: a month into using an extension, a dialog can inform the user "500 users have used this for over a month; do you want to increase performance at the risk of security by running it out of the slower but security-monitored safety mode?"
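As a rough sketch of the incubation check behind such a dialog (in Python, with made-up names: eligible_to_leave_sandbox and its parameters are assumptions, and the defaults are just the 500-users-for-a-month figures from the example above, with a month approximated as 30 days):

```python
from datetime import timedelta


def eligible_to_leave_sandbox(usage_durations,
                              min_users=500,
                              min_age=timedelta(days=30)):
    """Return True once enough users have run the extension long enough.

    usage_durations: one timedelta per user, saying how long that user
    has been running the extension in the sandbox.
    """
    long_term_users = sum(1 for d in usage_durations if d >= min_age)
    return long_term_users >= min_users


# 499 long-term users is not enough; the 500th tips it over.
durations = [timedelta(days=45)] * 499 + [timedelta(days=3)] * 200
not_yet = eligible_to_leave_sandbox(durations)
durations.append(timedelta(days=31))
ready = eligible_to_leave_sandbox(durations)
```

Only when the check passes would the user be offered the choice to leave safety mode; the thresholds themselves are a policy question.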

Finally, even vetted code will have bugs. NASA and their vendors do some of the most stringent code analysis, and while they only let a few through, they deal with homogeneous hardware and spend more time on both spec'ing and testing. The more I think about it, not protecting programs against new code (in general) seems like insanity, given the option.

Sergey said...

I must admit I like the ideas implemented in your article.