Graph computing benchmarks: comparing the scalability of Dask, Dagger.jl, TensorFlow and Julius

Graph computing solutions like Dask and Dagger.jl are gaining popularity among developers, mainly because of their easy built-in distribution capabilities. If you are considering one of these graph computing solutions for your next project but are uncertain how well they scale to real-world use cases, this benchmark has answers for you.

Overall, Julius scales 100-1000 times better than the best alternatives, depending on the problem, making it the only graph solution suitable for enterprise use cases. I know the numbers sound too good to be true, but you can verify them yourself by signing up for developer access to Julius here.

Comments and suggestions are very welcome!


Hi @Yadong_Li! I’ve taken an interest in this benchmark over the last week, and will be publishing a blog post providing some commentary on these results from my perspective as Dagger’s maintainer, including how I improved Dagger’s runtime on the benchmark with some performance optimizations. I’ll provide a link once it’s published!

One thing I found while working with this benchmark is that Dagger and Dask are actually receiving an unfair advantage: in both the y_n and s_n implementations, the final spawned task is never waited on/fetched, so the benchmark function returns before all computations have completed. This may not be an issue in Jupyter (perhaps the results are automatically fetched when printing the final value?), but it is problematic when running from the REPL or a regular script. I have updated benchmark scripts for Dagger and Dask at Julius Graph Benchmarks: Dagger and Dask · GitHub. I would love it if you could use the relevant pieces from those updated scripts to update the benchmark results!
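To make the pitfall concrete, here is a minimal sketch using only Python's stdlib `concurrent.futures` (it is not the benchmark code itself, and the function names are made up for illustration). The same effect occurs when `fetch` is omitted in Dagger.jl or `.compute()` in Dask: the timed function returns as soon as the last task is *submitted*, not when it *finishes*, so the measured time undercounts the real work.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    # Stand-in for a real task in the benchmark graph.
    time.sleep(0.2)
    return x * x

executor = ThreadPoolExecutor()

def benchmark_wrong():
    # Submits the task but never waits on the future:
    # the function returns while the work is still running.
    fut = executor.submit(slow_square, 10)
    return fut

def benchmark_right():
    # Waits for the result, so timing includes the computation.
    fut = executor.submit(slow_square, 10)
    return fut.result()

t0 = time.perf_counter()
benchmark_wrong()
wrong_elapsed = time.perf_counter() - t0   # near zero: work not awaited

t0 = time.perf_counter()
result = benchmark_right()
right_elapsed = time.perf_counter() - t0   # includes the full 0.2 s task

print(f"unfetched: {wrong_elapsed:.3f}s, fetched: {right_elapsed:.3f}s")
executor.shutdown()
```

The unfetched version reports an elapsed time close to zero regardless of how expensive the task is, which is exactly why the missing fetch inflated Dagger's and Dask's apparent performance.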

Anyway, great work on these benchmarks, and I hope we can work together to improve the graph computing experience in Julia going forward!

@jpsamaroo thanks for pointing out the missing fetch; indeed, I was running them in Jupyter, and it seems the results were automatically fetched by Jupyter. I will update the benchmark results according to your script.

Yes, we are more than happy to work with you to provide the best graph computing tools for Julia developers!


Hi @jpsamaroo, I have updated the benchmark using the latest master version of Dagger.jl and included the relevant fetch/compute calls for Dagger.jl and Dask. The latest Dagger.jl is significantly faster than the older version; great work on speeding it up! Please take a look at the latest results and let us know if you have any additional comments or suggestions. I noticed that Dagger.jl throws errors for large N (> 100,000); I have reported the error in your GitHub gist linked above.


Thanks a bunch for also re-running with Dagger master! Your results match up with what I got.

The error looks like a race somewhere in MemPool or Dagger, but I didn't encounter it on my system while running benchmarks up to 500K. It might have something to do with the Julia or MemPool versions being used; I ran with Julia 1.7.2 and the latest MemPool.

I’m using Julia 1.6.6 (the LTS version) and MemPool v0.3.9. For me, the error happens sporadically; sometimes it happens for smaller N as well, but for N > 200K it occurs almost every time.