Example for the Depth first multithread implementation performance gain as a motivation

stakaz · December 16, 2019, 10:39am

Hello, I have found the following discussion on Julia's multithreaded task scheduler implementation:

https://news.ycombinator.com/item?id=20507628

As far as I can tell from the comments, the main advantage and the “new thing” is the “depth-first approach” to scheduling tasks.

If this is a real advantage for numerical computing, it would be great to see an example, and possibly even to add this example to the start page of Julia (or at least to a prominent place on the web page).

Maybe someone has a good idea for such an example, together with a real comparison of the speed gain over the breadth-first approach.

I think this could greatly improve the attractiveness of Julia and show a huge difference from all the other available languages.

Tamas_Papp · December 16, 2019, 12:00pm

My understanding is that there is no speed gain compared to carefully written threaded code. It’s just that doing the right thing becomes much, much easier. Not unlike automatic memory management vs manual allocation.

While I find the new multithreaded implementation in 1.3 amazing, the discussion linked above makes me skeptical about its value for advertising Julia. The typical response is “language X had this in 1962”, without investing any effort in understanding what is going on. In this respect, HN is almost as bad as Slashdot. This is what it must feel like for an electrical engineer to talk to “audiophiles”. EDIT: scrolling past these, I realize that many people do get the idea, but perhaps since they don’t generate discussion they are sorted down.

stakaz · December 16, 2019, 12:06pm

OK, that is something I hadn’t thought about. Still, a comparison of the depth-first and breadth-first approaches for N cores with N tasks, each of which again spawns N tasks (as in one of the comments by Stefan), might be a good example of how “well-suited” Julia is by default (without knowing anything about multi-threading at all).
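The nested scenario mentioned here (N tasks that each spawn N more tasks) can be sketched roughly as follows. This is only a minimal illustration of the workload shape using `Threads.@spawn` from Julia 1.3; the function names `inner`, `outer`, and `nested` are hypothetical, the workload is a trivial placeholder, and the code does not itself compare depth-first against breadth-first scheduling.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical placeholder for the real per-task work.
inner(i, j) = i * j

# Each outer task spawns n inner tasks and sums their results.
function outer(i, n)
    tasks = [@spawn inner(i, j) for j in 1:n]
    return sum(fetch, tasks)
end

# Spawn n outer tasks, so n * n inner tasks are created in total.
function nested(n)
    tasks = [@spawn outer(i, n) for i in 1:n]
    return sum(fetch, tasks)
end

# Use as many outer tasks as there are threads (start Julia with e.g. JULIA_NUM_THREADS=4).
println(nested(nthreads()))
```

The point of the sketch is that neither `outer` nor `nested` needs to know how many threads exist or whether it is itself running inside another parallel call; deciding which of the spawned tasks run where is left to the scheduler.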

tkf · December 16, 2019, 3:12pm

Early termination in the parallel reduce I implemented in Transducers.jl depends on the depth-first scheduler. Ref: Thread- and process-based parallelisms in Transducers.jl (+ some news) - #3 by tkf
