I've been data mining ~10 years of SourceForge projects for a while now to try to understand how languages spread and what factors lead people to adopt them. It's been tricky, but I found the following charts pretty interesting:

(probability of each project category, given the project's language)

(1: given a project with N developers, the likelihood that it is written in a particular language; 2: the likelihood that a project in language L has N developers)
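
For anyone who wants to build similar charts, here is a minimal sketch of how conditional probabilities like these can be estimated from raw counts. It is not the actual pipeline behind the charts: the `projects` records and the `conditional` helper are made up for illustration, and it assumes each project has exactly one language, one category, and a developer count.

```python
from collections import Counter

# Hypothetical toy records: (language, category, number of developers).
# Invented for illustration; not taken from the SourceForge data.
projects = [
    ("Java", "Games", 1),
    ("Java", "Games", 3),
    ("Java", "Networking", 1),
    ("C", "Networking", 1),
    ("C", "Networking", 3),
    ("Python", "Games", 1),
]


def conditional(pairs):
    """Estimate P(target | given) from a list of (given, target) pairs."""
    joint = Counter(pairs)                          # count of each (given, target)
    marginal = Counter(given for given, _ in pairs)  # count of each given value
    table = {}
    for (given, target), count in joint.items():
        table.setdefault(given, {})[target] = count / marginal[given]
    return table


# P(category | language), as in the first chart caption.
p_category_given_language = conditional([(lang, cat) for lang, cat, _ in projects])

# P(language | N developers) and P(N developers | language),
# as in the two parts of the second caption.
p_language_given_size = conditional([(n, lang) for lang, _, n in projects])
p_size_given_language = conditional([(lang, n) for lang, _, n in projects])

print(p_category_given_language["Java"])
print(p_language_given_size[1])
print(p_size_given_language["C"])
```

Each row of a table sums to 1, so the same counts give different pictures depending on which variable is conditioned on. That is why the second caption lists both directions.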

