HPC domain

I know there is an existing distributed/parallel category, but I thought it might be a good idea to have a place to discuss more deployment issues on HPC machines (working with shared file systems, launching jobs, working with dependencies, etc.)

cc: @vchuravy @andreasnoack

16 Likes

Oooh! Ooooh! Me! Me! I have my hand up!
I definitely think you are correct in saying we need more discussion of HPC deployments.
For one thing I never hear about the Modules environment on here - Modules lets you use different versions of software packages on HPC or other systems, by flexibly setting environment variables.

However, could we not form another partition within the parallel/distributed category?
The counterargument to that one is that not all HPC is parallel - there is plenty of embarrassingly parallel work and deep learning happening on HPC setups.

A more radical proposal - rename Parallel/Distributed to Performance or Julia at Scale or something like that. Then we can all co-exist below that?
I hesitate to call it HPC as that rather sounds like a land grab.

If a separate HPC domain is set up, I will of course join in.

1 Like

I like “Julia at Scale”.
The question with “high performance” is that it implies that SIMD, GPU, and other node-level performance domains should also be included. That is totally fine, since node performance is often the biggest part of the work for HPC codes. This trend has held for more than 20 years now. Considering the power of the Power9 + multi-V100 nodes on recent supercomputers, it is rather difficult to pretend that high performance is not related to a single node…

@LaurentPlagne I agree with you.

I updated the name of the Parallel/Distributed domain to Julia at Scale and added a short blurb on what it is about at About the Julia at Scale category (note that it is a wiki post, so changes are welcome).

3 Likes