Thursday, May 27, 2010

p(category | language)

I've been data mining ~10 years of SourceForge projects for a while now to try to understand how languages spread and what factors lead people to adopt them. It's been tricky, but I found the following charts pretty interesting:

(probability of the category for a project given the language)
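For anyone wanting to reproduce this kind of chart, p(category | language) can be estimated by counting projects per (language, category) pair and normalizing within each language. The snippet below is only a rough sketch of that idea, not the actual analysis code: the `projects` list is made-up toy data, and `category_given_language` is a name invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one (language, category) pair per project.
# These are NOT real SourceForge numbers.
projects = [
    ("Java", "Games"),
    ("Java", "Software Development"),
    ("Java", "Software Development"),
    ("Python", "Software Development"),
    ("Python", "Scientific"),
]

def category_given_language(pairs):
    """Estimate p(category | language) by relative frequency."""
    counts = defaultdict(Counter)
    for language, category in pairs:
        counts[language][category] += 1
    # Normalize each language's category counts so they sum to 1.
    return {
        language: {cat: n / sum(cats.values()) for cat, n in cats.items()}
        for language, cats in counts.items()
    }

p = category_given_language(projects)
```

Each row of `p` sums to 1, so `p["Java"]` is the category distribution for Java projects only; it says nothing about how common Java is overall.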

(1: given a project with N developers, the probability that it is written in a particular language, 2: given a project in language L, the probability that it has N developers)
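The two charts condition in opposite directions, which is easy to mix up. As a small sketch (again with invented toy data and hypothetical function names), both can be read off the same joint counts by dividing by a different marginal:

```python
from collections import Counter

# Hypothetical toy data: one (language, number of developers) pair per project.
records = [
    ("Java", 1), ("Java", 1), ("Java", 3), ("Java", 3),
    ("Python", 1),
]

joint = Counter(records)                          # count of (language, N)
by_language = Counter(lang for lang, _ in records)  # count of projects per language
by_size = Counter(n for _, n in records)          # count of projects per team size

def p_language_given_size(lang, n):
    """Chart 1: among projects with n developers, the share in `lang`."""
    return joint[(lang, n)] / by_size[n]

def p_size_given_language(n, lang):
    """Chart 2: among projects in `lang`, the share with n developers."""
    return joint[(lang, n)] / by_language[lang]
```

With this toy data, two of the three single-developer projects are in Java, while only half of the Java projects have a single developer, so the two numbers differ even though they share the same numerator.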

(log number of developers with N projects)

(increasing use of a language for different tasks over 10 years -- 1: Java, 2: Python)

Next step: correlating factors to explain this stuff.
