Two bits of clarification. Are these two statements correct?
- All parallel functions require that the user has started worker processes separately with addprocs().
- You need to add at least 2 worker processes to see benefits of parallelization because the main process does not act as a worker itself.
If so, I intend to edit the docs to better convey these views.
Hi TurboNick,
I think I can help. Can you tell me what type of parallelization you’re looking to use? There are several right now in the language (@threads, @spawn, etc.).
In general, two ways to run Julia in parallel come to mind immediately (please, someone add to this list if I miss something). One can start the Julia REPL with `julia -p 4` for four worker processes (or however many you want). Alternatively, for multithreading, you can set an environment variable in your terminal prompt, .bash_profile, .zshrc, etc. if you want to always invoke this, as `export JULIA_NUM_THREADS=4` or whatever number of threads you want.
Once the number of processes or threads is set, you should see parallelization if everything goes well. If you can be more specific about what you’re seeing and what you’re looking to parallelize, then more can be said.
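A quick way to check what a session actually has available is sketched below. It only uses `Threads.nthreads()`, `nprocs()`, and `nworkers()`; the printed labels are just for illustration.

```julia
using Distributed

# Number of threads in this process (controlled by JULIA_NUM_THREADS).
println("threads:   ", Threads.nthreads())

# Total number of processes, counting the main process.
println("processes: ", nprocs())

# Number of worker processes. When no workers have been added,
# the main process counts as the only worker.
println("workers:   ", nworkers())
```

Note that processes and threads are counted separately: `julia -p 4` changes the process counts, while `JULIA_NUM_THREADS=4` changes the thread count.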
Thanks @swishmas. I’m exploring several variants for their impact and basing it on an already-running julia process because I think that’d be easier for anyone using my code.
So at the moment I’m looking at `pmap` and `@distributed for`. Multithreading instead of multiple processors might be more natural, but I gather that’s still being tested?
Do you know whether the two statements in my first post are true?
That is, distributed/pmap with 2 processes (1 worker) would of course allow you to do other things on the main process while your code is running on the single worker, but would not itself be in parallel. So you need 3 processes (2 workers) to execute in parallel?
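One way to check this from an already-running session is the sketch below. It assumes a fresh session with no workers yet, and it adds two local workers with `addprocs(2)`. Each `pmap` call returns `myid()`, so the result shows which processes did the work. As far as I understand, once workers exist the default pool holds only the workers, so process 1 should not appear in the list.

```julia
using Distributed

# Add two local worker processes from inside a running session
# (the in-session counterpart of starting with `julia -p 2`).
addprocs(2)

# Each call returns the id of the process that executed it.
ids = pmap(i -> myid(), 1:8)

println("nprocs = ", nprocs(), ", workers = ", workers())
println("process ids used by pmap: ", sort(unique(ids)))

# @distributed with a reducer splits the range across the workers
# and combines the partial results with (+).
total = @distributed (+) for i in 1:100
    i^2
end
println("sum of squares = ", total)
```

Changing `addprocs(2)` to `addprocs(1)` and re-running in a fresh session is a simple way to see what a single worker does.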
Sorry for the slow reply. Hopefully this is still useful.
As I understand it, if you designate 4 threads, then the “master thread” is included in that total. When I parallelize over 4 threads, 4 threads are used on my machine.
I’m less familiar with `@distributed`, but I can comment on `pmap` and `@threads`. For `pmap`, the only trick is to initialize Julia with `julia -p 4`, and this gives parallelization over 4 worker processes. For `Threads.@threads`, I start with the environment variable I listed above. Again, `=4` means I get 4 threads that are parallelized over.
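For the threaded case, a minimal sketch looks like the following, assuming Julia was started with `JULIA_NUM_THREADS` set. The array `results` and the squaring are placeholders for real work.

```julia
using Base.Threads

results = zeros(Int, 8)

# Iterations are divided among the available threads.
# Each iteration writes to its own slot, so no locking is needed.
@threads for i in 1:8
    results[i] = i^2
end

println("threads available: ", nthreads())
println(results)
```

With only one thread the loop still runs correctly, just without any parallelism.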
As I understand the current status, the parallelization is still in testing. There’s a new `@sync` system in v1.3 (and an earlier version or two), but the problem is that the new parallelization system can be slower since it is not yet optimized in terms of allocations. So code run in v1.1.1 can be faster than in v1.2/1.3 (I found a 2x slowdown between those versions). This is supposed to be fixed, but it’s not known when that will be completed.
As for your last paragraph, I suppose the way I would think about it is that whatever happens on that third process should be counted as a third process, initialized in parallel with whatever is running on the other two. So, I would write the function with something like `for i = 1:3` and then have an `if` statement saying that the `i=3` case gets the sequential commands that are independent of the other two. That would be where I start, but you might find something more efficient as you get more feedback and develop more code.
Good luck!