@lopezm94 nice to see that you’re interested in this project! It’s true that Dagger.jl concerns itself with some of those issues. Some project ideas:
There is an important issue with the current scheduler to tackle ASAP: https://github.com/JuliaParallel/Dagger.jl/issues/53
Depending on how you want to solve it, this might constitute a whole GSoC project. I would suggest hooking into the Dask.distributed scheduler. This would not only fix that problem but also bring a slew of optimizations from Dask.distributed (e.g. HDFS locality optimizations) to Dagger. Matt Rocklin, the author of Dask, might also be interested in mentoring such a project.
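For context on what such a bridge would involve: dask represents computations as plain Python dicts mapping keys to tasks, where a task is a tuple whose first element is a callable. A Dagger-to-dask bridge would essentially translate Dagger’s DAG into this format and hand it to the Dask.distributed scheduler. Here is a minimal sketch of that graph format, with a toy recursive evaluator standing in for the real scheduler (the evaluator is illustrative only, not dask code):

```python
from operator import add, mul

# A dask-style task graph: a dict mapping keys to either literal
# values or tasks. A task is a tuple of (callable, args...), where
# args may themselves be keys into the graph.
graph = {
    "x": 1,
    "y": 2,
    "z": (add, "x", "y"),   # z = x + y
    "w": (mul, "z", 10),    # w = z * 10
}

def get(dsk, key):
    """Toy evaluator: resolve `key`, recursively computing tasks.

    The real Dask.distributed scheduler consumes the same dict
    format but adds parallelism, data locality, and resilience.
    """
    val = dsk[key]
    if isinstance(val, tuple) and callable(val[0]):
        func, *args = val
        return func(*(get(dsk, a) if a in dsk else a for a in args))
    return val

print(get(graph, "w"))  # → 30
```

Because the graph is just data, emitting it from Dagger’s internal representation is a translation problem rather than a scheduling one, which is what makes this route attractive.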
Another interesting experiment would be to try to implement something like Naiad, then use that as Dagger’s scheduler and compare its performance with the existing scheduler in Dagger, which is based on dask’s shared-memory scheduler. I think this is a great project for the time scale of GSoC, but I believe it’s high-risk compared to the former. It’s also unclear to me whether this sort of scheduling is a good fit for the array-based abstractions we want to build up to in Dagger.
Another small task is to dust off the GPU support PR https://github.com/JuliaParallel/Dagger.jl/pull/33 (This may be too small to constitute a whole GSoC project on its own.)
I encourage you to read up on the links and prepare a draft proposal if you are interested in any of this! New ideas are also welcome!
Updated: Changed ordering, formatting changes.