Developing a Beginner's Roadmap to Learn Julia High Performance Computing for Data Science

Only partly joking :wink: – you really can’t go wrong with learning MPI.

For very large problems you may want a hierarchical approach, with a shared-memory model within nodes and a distributed-memory model between nodes, but it’s the shared part that is optional. Latency between nodes is a killer, and shared memory generally doesn’t scale beyond a single node. And if you want fine-grained control over exactly what information critically needs to be shared where, MPI is the de facto way to get it.
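To make that concrete, here is a minimal sketch of explicit point-to-point communication with MPI.jl: a ring exchange where each rank sends its id to its neighbor. Exact signatures vary across MPI.jl versions (this follows the older positional style), so treat it as illustrative rather than copy-paste-ready.

```julia
# Ring exchange sketch with MPI.jl: each rank sends its id to the next rank.
# Run with e.g. `mpiexec -n 4 julia ring.jl`.
using MPI

MPI.Init()
comm   = MPI.COMM_WORLD
rank   = MPI.Comm_rank(comm)
nprocs = MPI.Comm_size(comm)

dst = mod(rank + 1, nprocs)   # neighbor to send to
src = mod(rank - 1, nprocs)   # neighbor to receive from

send_buf = fill(rank, 1)
recv_buf = similar(send_buf)

# The point of MPI: you decide exactly what data moves where, and when.
req = MPI.Isend(send_buf, dst, 0, comm)
MPI.Recv!(recv_buf, src, 0, comm)
MPI.Wait!(req)

println("rank $rank received $(recv_buf[1]) from rank $src")
MPI.Finalize()
```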

Some other, more general advice, IMHO:

  • Write like you’re writing in C. Do things manually as much as possible.
  • Amdahl’s law: know it, love it
  • Do use vector registers, if you’re on CPU. LoopVectorization is awesome for that. If your problem isn’t amenable to that, make it amenable.
  • While I’ve tried to avoid having to use GPUs, that’s probably a losing proposition long term. Especially given that “Xeon Phi” and the like have not really taken off, and systems like Cori may be among the last serious supercomputers to use them. Most (I suspect all) of the Exascale systems in development are getting a substantial majority of their flops from GPUs. That’s another reason Julia is great: I’d much rather use CUDA.jl than raw CUDA. (That said, I still think there will be a niche for CPU-only compute for a long time yet – quantum chemistry, anyone?)
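On the Amdahl’s law point above: with parallel fraction p on N workers, the best possible speedup is 1/((1 − p) + p/N), and it’s worth internalizing how brutal that cap is. A one-liner makes it tangible:

```julia
# Amdahl's law: maximum speedup with parallel fraction p on N workers.
amdahl(p, N) = 1 / ((1 - p) + p / N)

amdahl(0.95, 64)   # ≈ 15.4: even 95%-parallel code falls well short of 64x
amdahl(0.95, Inf)  # = 20.0: the serial 5% caps the speedup at 20x, forever
```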
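As for vector registers, a sketch of what using LoopVectorization looks like in practice (assuming the LoopVectorization package; the function name here is my own, not from any library): you write a plain loop and annotate it with `@turbo`, which emits SIMD code for amenable loops like this reduction.

```julia
# Sketch: SIMD-vectorizing a dot-product reduction with LoopVectorization.
using LoopVectorization

function dot_turbo(x, y)
    s = zero(eltype(x))
    @turbo for i in eachindex(x, y)   # @turbo vectorizes the loop body
        s += x[i] * y[i]
    end
    return s
end
```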
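And a taste of why CUDA.jl beats raw CUDA for day-to-day work: ordinary array code runs on the GPU with no hand-written kernels, as in this small sketch.

```julia
# Sketch: broadcasting on the GPU with CUDA.jl (requires a CUDA-capable GPU).
using CUDA

x = CUDA.rand(Float32, 1024)  # array allocated on the device
y = 2f0 .* x .+ 1f0           # broadcast fuses into a single GPU kernel
```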