Note: this is a rewrite of the original posting as the blogger interface confused me into thinking the edit an article button was just a comment button.
"NESL": Seminal work on data parallelism that handles regular structures like matrices very well. However, super common structures like trees, graphs, etc. are supported only in hard-to-read encoding or overly restricted (e.g., fixed depth) legible ones. Front page of google scholar: ~600 citations.
"Flattening Trees": shows that recursive types -- such as trees -- are no big deal to add at the interface level and that the flattening transformation can be modified to support them. ~22 citations and what seems like the real first step towards making flattening acceptable for general purpose languages. "More types for nested data parallel programming" finishes the job. (Nit: conflating nested data parallel with flattening is bad; these papers are really about the latter which is just *one* approach to the former).
Bonus read: "Inductive Graphs and Functional Graph Algorithms" shows how to (sequentially) support graphs. It's interesting in that 1) graph specialization is restricted to an ADT that could be optimized external to the language and 2) similarly, I suspect nested data parallelism can be reasonably achieved as library with it. At least for the near future, having the work be at the library level is a big deal: API changes are more forgiving.