Only partly joking: you really can't go wrong with learning MPI.
While for very large problems you may want some sort of hierarchical parallelism, with a shared memory model within nodes and a distributed memory model between nodes, it's really the shared part that is optional. Latency between nodes is the killer, and shared memory generally doesn't scale beyond a single node anyway. And if you want fine-grained control over exactly what information critically needs to be shared where, MPI is really the de facto way to do that.
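For what it's worth, here is a minimal sketch of what that granularity looks like with MPI.jl (the array and the reduction are just placeholders): each rank owns its own data, and only what you explicitly pass to a communication call ever crosses rank boundaries.

```julia
# Minimal distributed-memory sketch with MPI.jl: each rank works on its own
# chunk, and only the single number you explicitly reduce is communicated.
using MPI

MPI.Init()
comm  = MPI.COMM_WORLD
rank  = MPI.Comm_rank(comm)
nproc = MPI.Comm_size(comm)

# Each rank owns its slice of the problem; nothing is shared implicitly.
local_chunk = rand(1_000_000)
local_sum   = sum(local_chunk)

# You decide exactly what gets communicated: here, one Float64 per rank.
global_sum = MPI.Allreduce(local_sum, +, comm)

rank == 0 && println("global sum over $nproc ranks = ", global_sum)

MPI.Finalize()
```

Run it under `mpiexec -n 4 julia script.jl` (or the `mpiexecjl` wrapper that MPI.jl can install for you).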
Some other, more general advice, IMHO:
- Write like you're writing in C. Do things manually as much as possible.
- Amdahl's law: know it, love it (the formula is spelled out after this list).
- Do use vector registers if you're on CPU. LoopVectorization is awesome for that (see the sketch after this list). If your problem isn't amenable to that, make it amenable.
- While I've tried to avoid having to use GPUs, that's probably a losing proposition long term, especially given that "Xeon Phi" and the like never really took off, and systems like Cori may be the last serious supercomputers to use them. Most (I suspect all) of the exascale systems in development are getting a substantial majority of their flops from GPUs. Which is another reason Julia is great: I'd so much rather use CUDA.jl than raw CUDA (see the last sketch below). That said, I still think there will be a niche for CPU-only compute for a long time yet; quantum chemistry, anyone?
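Re: Amdahl's law, the formula worth internalizing is just this (a toy calculation, not tied to any particular code):

```julia
# Amdahl's law: if a fraction p of the runtime parallelizes perfectly,
# the best possible speedup on N workers is 1 / ((1 - p) + p / N).
amdahl_speedup(p, N) = 1 / ((1 - p) + p / N)

# Even 95% parallel code tops out below 20x, no matter how many workers:
amdahl_speedup(0.95, 1024)   # ≈ 19.6
amdahl_speedup(0.95, Inf)    # = 20.0
```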
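On the vector-registers point, this is the sort of plain, contiguous loop that LoopVectorization's `@turbo` handles well (a toy saxpy, just to show the shape of it):

```julia
# A simple contiguous loop is exactly what LoopVectorization can turn
# into SIMD code on the CPU via @turbo.
using LoopVectorization

function saxpy!(y, a, x)
    @turbo for i in eachindex(x)
        y[i] = a * x[i] + y[i]
    end
    return y
end

x = rand(Float32, 10^6)
y = rand(Float32, 10^6)
saxpy!(y, 2f0, x)
```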
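And to illustrate why CUDA.jl beats raw CUDA for me (a rough sketch; the kernel is just a toy axpy): most of the time broadcasting on a `CuArray` is all you need, and even when you do write a kernel it's still plain Julia.

```julia
using CUDA

x = CUDA.rand(Float32, 10^7)
y = CUDA.rand(Float32, 10^7)

# A fused broadcast compiles to a single GPU kernel; no hand-written CUDA C.
z = 2f0 .* x .+ y

# And when you do want an explicit kernel, it's still just Julia:
function axpy_kernel!(z, a, x, y)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(z)
        @inbounds z[i] = a * x[i] + y[i]
    end
    return nothing
end

@cuda threads=256 blocks=cld(length(z), 256) axpy_kernel!(z, 2f0, x, y)
```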