Ok. In this case one may randomize the order of silver application to make sure that the branch prediction plays no role.
Likely that m is “mid”
When you do the changes described in that section, does it run ok for you? It does not for me. I have a large memory allocation, and the code is actually slower. I am probably doing something wrong. Showing the full code in that blog would have maybe been better for new people like me who are trying to learn and reproduce their “experiment”.
To chime in on your issue that your code provides no speedup VS the code provided as a solution.
I found this thread to be helpful: For loops run SUPER slow in Julia when outside a function - Stack Overflow
and I believe it provides hints as to why you did not see the expected results in your code.
Hi @GeoKou , thank you for the link but I don’t get it. All my loops are inside functions.
It seems, after testing it myself, that the difference that enables the desired behavior is having the “partition” part of the code in its own function.
I think it is not exactly the issue in the thread I sent, but a similar one; by having the while loops in a function that is called on each thread, julia must be putting up “safety” mechanics to prevent any issues (even though the code is written in a way that no such issues should arise), and this slowdown is removed by having the while loops (the partition function) in their own function.
Ok let’s assume you are right, then if you look at the blog sent by @blackeneth you can see that the parallel mergesort has a loop in function body and the code is running perfectly fine in parallel. So how do you explain this?
Also, interestingly if we move the 2 while loops in mergesort to a function and call that function inside parallel merge sort then it slows down the function. So we have here the opposite effect
I spawned a new thread, with the two codes in one, here:
@gitboy16 or anybody else, please post there the mergesort with the opposite behavior.
This is a nice piece of code. I hope I have your permission to borrow it? Thanks.
I teach for 25 years a Parallel Computing class to advanced engineering students. I’ve been toying with the idea of using Julia as a common “algorithmic notation” to describe parallel algorithms and ideas, --executable pseudocode-- as I call it
Since Julia 1.8, I think I got all I need.