GSoC 2017: Ensure that Julia runs smoothly on current large HPC systems

I’m interested in the Julia project for GSoC 2017, and I would like to participate in the project “Ensure that Julia runs smoothly on current large HPC systems”. If I understand the project description correctly, I have three ideas about the result you would like to get.

  • First, we can start an interactive job on the cluster by allocating a limited set of resources via the interactive qsub mode. The advantage is that it is easy to integrate. The disadvantages are that the set of resources cannot be changed dynamically and that not all clusters allow interactive sessions.
  • The second idea is to start the workers on demand and have them connect to a coordinator listening on a particular port on the client machine. The advantage is that it uses resources more effectively. However, a worker can run for a long time if the task is overloaded. Moreover, the port may be closed.
  • The last approach is to run each Julia command in batch mode through the cluster job queue and wait for the job to complete. Advantage: it is easy to implement. Disadvantage: it will be slow.

I believe the second option is the most balanced.
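
To make the second option concrete, here is a minimal sketch of the connection pattern: a coordinator listens on a port, and each worker connects out to it, receives one task, and sends back a result. It is written in Python purely for illustration and is not Julia's actual worker protocol. The names run_coordinator, run_worker, and demo are hypothetical, the "task" is just squaring a number, and threads on localhost stand in for workers that a real setup would launch through the cluster job queue.

```python
import socket
import threading


def run_coordinator(server, tasks, results):
    """Accept one worker connection per task, send the task, collect the reply."""
    for task in tasks:
        conn, _ = server.accept()
        with conn, conn.makefile("rw") as stream:
            stream.write(f"{task}\n")
            stream.flush()
            results.append(int(stream.readline()))


def run_worker(host, port):
    """Connect out to the coordinator, read one number, reply with its square."""
    with socket.create_connection((host, port)) as conn, conn.makefile("rw") as stream:
        n = int(stream.readline())
        stream.write(f"{n * n}\n")
        stream.flush()


def demo(tasks):
    """Run the coordinator and one worker per task on localhost (assumption:
    in a real deployment the workers would be started as cluster jobs)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen()
    host, port = server.getsockname()

    results = []
    coordinator = threading.Thread(target=run_coordinator, args=(server, tasks, results))
    coordinator.start()

    workers = [threading.Thread(target=run_worker, args=(host, port)) for _ in tasks]
    for worker in workers:
        worker.start()
    for thread in workers + [coordinator]:
        thread.join()

    server.close()
    # Workers connect in arbitrary order, so sort for a stable result.
    return sorted(results)


if __name__ == "__main__":
    print(demo([1, 2, 3]))
```

Because the workers open the connection themselves, the compute nodes need no open inbound ports; only the coordinator's port on the client machine must be reachable, which is exactly the weak point mentioned above.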

Finally, do I need my own access to a cluster, or will it be provided?
Looking forward to your reply,
Anton Gavrikov
