I've been working with Apache Spark quite a bit lately in an effort to bring it into the fold as a viable tool for solving some of the data-intensive problems encountered in supercomputing. I've already added support for provisioning Spark clusters to a branch of the myHadoop framework I maintain so that Slurm, Torque, and SGE users can begin playing with it, and as a result of these efforts, I'v