If neither VTune nor Callgrind/KCachegrind works by noon tomorrow, I'll be one sad puppy.
That's just a detail. What's really been bugging me is how to do performance modeling of my algorithms on cellphones that won't exist for another 5 years. A 4-core x 2-threads-per-core Nehalem is a bit of a stretch for a pocket PC... but if I show strong scaling on such a 2.66 GHz beast, will I still get the same on a lighter 4-8 core ARM? E.g., relative communication cost might be lower on an ARM, so scaling should be better, though a smaller cache size might reverse this entirely. Similarly, beyond L1 and L2, phones aren't NUMA: I need a machine without socket jumps (or, as Sam suggested, blur together the address space). The world for a 4-8 core ARM...
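For my own bookkeeping, here's roughly the kind of back-of-the-envelope strong-scaling model I have in mind. Every number in it (per-unit compute time, communication cost per core, cache sizes, miss penalty) is a made-up placeholder, not a measurement of Nehalem or of any ARM part; the point is only to show how cheaper relative communication can help scaling while a smaller per-core cache can undo it.

# Toy strong-scaling model. All parameters are hypothetical placeholders.

def strong_scaling_speedup(cores, work, flop_time, comm_per_core,
                           working_set, cache_bytes, cache_miss_penalty):
    """Estimated speedup of a fixed-size problem on `cores` cores.

    work               -- total work units (fixed: strong scaling)
    flop_time          -- seconds per work unit when the data fits in cache
    comm_per_core      -- seconds of communication overhead added per core
    working_set        -- total bytes touched by the whole problem
    cache_bytes        -- per-core cache capacity
    cache_miss_penalty -- multiplier on compute time when a core's share spills cache
    """
    def time_on(p):
        compute = (work / p) * flop_time
        if working_set / p > cache_bytes:   # this core's share spills out of cache
            compute *= cache_miss_penalty
        comm = comm_per_core * p            # crude model: overhead grows with core count
        return compute + comm

    return time_on(1) / time_on(cores)


if __name__ == "__main__":
    WORK = 1e9          # fixed problem size, in work units (made up)
    SET = 64 * 2**20    # 64 MiB working set (made up)

    # "Big desktop" guesses: fast cores, large cache, relatively pricey communication.
    for p in (1, 2, 4, 8):
        s = strong_scaling_speedup(p, WORK, flop_time=1e-9, comm_per_core=0.02,
                                   working_set=SET, cache_bytes=8 * 2**20,
                                   cache_miss_penalty=3.0)
        print(f"desktop-ish, {p} cores: speedup {s:.2f}")

    # "Phone" guesses: slower cores, cheaper communication, much smaller cache.
    for p in (1, 2, 4, 8):
        s = strong_scaling_speedup(p, WORK, flop_time=4e-9, comm_per_core=0.01,
                                   working_set=SET, cache_bytes=1 * 2**20,
                                   cache_miss_penalty=3.0)
        print(f"phone-ish, {p} cores: speedup {s:.2f}")

With these particular invented numbers the desktop-ish machine eventually fits each core's share into cache and scales better despite its pricier communication, while the phone-ish one keeps paying the miss penalty; flip the parameters and the conclusion flips too, which is exactly why I don't trust the Nehalem numbers to transfer without measurements on real hardware.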
In other news, I bumped into a quickly written paper about an unnecessarily complex idea that got the same results as one of my projects (using similar evaluation metrics, even), yet mine was vehemently rejected from the very same conference. Considering the conference is respectable, that's pretty damning for my writing style. I wonder how much longer it'll take for me to be able to write convincingly; this is a ridiculous setback to have. Hurray grad school!
I need to get moving on finding a place in Capitol Hill.
Chicago has a course called Academic/Professional Writing that's designed to improve your professional writing. PhD students were highly encouraged to take it. Any idea if Berkeley might offer such a course?
Charlotte