I’m checking out the HPC projects in the Julia GSoC ideas section, but there don’t seem to be any proposed mentors. I have experience in Julia and would love to build more expertise in programming models and distributed computing. Is there someone I could contact about the “Dynamic distributed execution for data parallel tasks in Julia” project?
Any leads on where I could find someone would also be helpful. Thanks!
Depending on how you want to solve it, this might constitute a whole GSoC project. I would suggest hooking into the Dask.distributed scheduler. This would not only fix that problem but also bring a slew of optimizations from Dask.distributed (e.g. HDFS locality optimizations) to Dagger. Matt Rocklin, the author of dask, might also be interested in mentoring such a project.
Another interesting experiment would be to implement something like Naiad, use it as Dagger’s scheduler, and compare its performance with Dagger’s existing scheduler, which is based on dask’s shared-memory scheduler. I think this is a great project for the time scale of GSoC, but I believe it’s higher-risk than the former: it’s unclear to me whether this sort of scheduling is a good fit for the array-based abstractions we want to build up to in Dagger.
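
For anyone unfamiliar with what a dask-style shared-memory scheduler works on, here is a toy sketch in Python. It is not dask's or Dagger's actual code: `run_graph` and the graph keys are names I made up for illustration, and a real scheduler would also handle parallel execution, worker placement, and memory release. The only thing it is meant to show is the graph format (a dict mapping keys to either data or `(function, *args)` tuples) and the fact that a task runs once its dependencies have been computed.

```python
from operator import add, mul


def run_graph(graph, target):
    """Evaluate `target` in a dask-style task graph, computing each key once.

    Hypothetical helper for illustration only; single-threaded and recursive.
    """
    cache = {}

    def evaluate(key):
        if key in cache:
            return cache[key]
        task = graph[key]
        if isinstance(task, tuple) and task and callable(task[0]):
            func, *args = task
            # String arguments that name other graph keys are dependencies;
            # anything else is passed through as a literal value.
            resolved = [
                evaluate(a) if isinstance(a, str) and a in graph else a
                for a in args
            ]
            result = func(*resolved)
        else:
            result = task  # plain data, nothing to compute
        cache[key] = result
        return result

    return evaluate(target)


graph = {
    "x": 1,
    "y": 2,
    "sum": (add, "x", "y"),      # depends on "x" and "y"
    "scaled": (mul, "sum", 10),  # depends on "sum"
}

print(run_graph(graph, "scaled"))  # prints 30
```

Swapping in a different scheduler, whether Dask.distributed or something Naiad-like, essentially means replacing the `evaluate` strategy while keeping the same graph as input, which is why comparing schedulers on the same workloads is feasible.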