Run a julia application at large scale (on thousands of nodes)

Thanks @JaredCrean2! This would certainly be the most complete solution to this problem. However, it seems to require quit some work to get this properly working.

Thus, I would prefer a solution that is only partial but simpler. I would like
to find a simple technique to avoid congestion effects when thousand of processes run a Julia application and therefore need to access the same files. It is not a big issue if each process does some compilation as long as this does not interfere with other processes and cause congestion at scale. To put it simple, a certain constant job setup time is fine; however, it should not significantly increase at large scale.

I would hope to find a solution along the following lines:

  1. Precompile all required modules by invoking Base.compilecache(modulename) [1].
  2. Broadcast the precompiled modules, the julia exe (and its dependencies found with ldd) and the main application code [2] to the RAM-disks of the compute nodes.
  3. Set the Julia variable DEPOT_PATH[1] to point to the local RAM-disk (the precompiled modules must be inside DEPOT_PATH[1]/compiled/ [1])

To my understanding, there is (at least) one issue with this: the Julia processes will not directly take the precompiled modules from DEPOT_PATH[1]/compiled/, but they will first check inside DEPOT_PATH[1]/packages/ if the module needs to be precompiled again. So, I would like to ask you:

  • Do I understand the situation right?
  • If yes, is there a way to deactivate this check, i.e. to force the usage of these precompiled modules without any check?
  • And if this check cannot be deactivated, would a symlink in DEPOT_PATH[1]/packages/ to the original packages path (~/.julia/packages/) fix it (unfortunately, then all processes would need to access these same files simultaneously what could cause congestion)?
  • Do you see any other issues with this approach? What else will all Julia processes need to access from the installation on home?

Thanks!!!

[1] Modules ยท The Julia Language
[2] The majority of the application code should also be put in a module itself to have it precompiled (there could even be just a call to main() in the application code).

1 Like