First steps into parallel computing!

Hi all. Thanks for reading

I am totally new to parallel computing and to working with cores, workers, threads, etc. (I have no idea about any of it hahaha). At home I have a single-core Mac and I have always worked like that. At the office we got a new server with 64 cores, and now I want to move some of my workflow to “parallel computing” to run things faster.

I have 2 simple cases. I hope you guys can help me turn them into parallel code, and I will extrapolate from there to my own needs.

CASE 1: 1 Database, 4 Processes, 4 cores: Run 4 different processes in 4 different cores. Return the results to one core

Get Data: get 1e4 angles
Data = rand(0:1:360,10000)

In core 1: Fix data for calculations and wait for the results from other cores
Results=zeros(10000,4);
Results[:,1]=Data*pi/180 # from angles to radians.

In core 2: compute sin
A=sin.(Results[:,1])

In core 3: compute cos
B=cos.(Results[:,1])

In core 4: compute tan
C=tan.(Results[:,1])

In core 1: Receive all computations and save in main variable
Results[:,2]=A;
Results[:,3]=B;
Results[:,4]=C;

CASE 2: 1 Database, 4 Processes. Split the Database and send a piece to the cores. Return the results

Get Data: get 1e4 angles
Data = rand(0:1:360,10000)

In core 1: Fix data for calculations and wait for the results from other cores
Results=zeros(10000,2);
Results[:,1]=Data*pi/180 # from angles to radians.

In core 2: compute sin for the first 25% of the data
A=sin.(Results[1:2500,1])

In core 3: compute sin for the second 25% of the data
B=sin.(Results[2501:5000,1])

In core 4: compute sin for the last 50% of the data
C=sin.(Results[5001:10000,1])

In core 1: Receive all computations and save in main variable
Results[1:2500,2]=A;
Results[2501:5000,2]=B;
Results[5001:10000,2]=C;

Thanks!

No one?

Metareply: It would help if you would

  1. provide runnable code
  2. show what you’ve tried so far
  3. ask concrete questions (there are none in your OP)

This increases the likelihood that you will receive responses.


You should start with the Threads.@threads macro. It is closest to your last example, but it splits the work across threads automatically:

Threads.@threads for i in eachindex(angle)
    Results[i, 1] = angle[i] * pi / 180   # degrees to radians
    Results[i, 2] = cos(Results[i, 1])    # trig functions expect radians
    Results[i, 3] = sin(Results[i, 1])
    # ... etc.
end

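Threads is your first go-to when you have a single computer with multiple cores and want to loop over a large number of items in parallel, as in the example above. It is the easiest option when you already have a for loop and there are no race conditions, i.e. each iteration writes only to its own location. Note that Julia has to be started with more than one thread for this to help, for example with julia -t 4 or by setting the JULIA_NUM_THREADS environment variable.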
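To make the above concrete, here is a minimal runnable sketch of both of your cases with threads. The function names trig_table and trig_tasks are made up for illustration, and the column layout follows your original post (radians, sin, cos, tan). Treat it as one possible way to write it, not the definitive one.

```julia
using Base.Threads

# Case 2 style: a single loop whose iterations are split across threads.
# Each iteration writes only to its own row, so there is no race condition.
function trig_table(degrees::AbstractVector{<:Real})
    results = zeros(length(degrees), 4)
    @threads for i in eachindex(degrees)
        rad = degrees[i] * pi / 180   # degrees to radians
        results[i, 1] = rad
        results[i, 2] = sin(rad)
        results[i, 3] = cos(rad)
        results[i, 4] = tan(rad)
    end
    return results
end

# Case 1 style: one task per function, results collected with fetch.
function trig_tasks(degrees::AbstractVector{<:Real})
    rad = degrees .* pi ./ 180
    s = Threads.@spawn sin.(rad)
    c = Threads.@spawn cos.(rad)
    t = Threads.@spawn tan.(rad)
    return hcat(rad, fetch(s), fetch(c), fetch(t))
end

data = rand(0:360, 10_000)
results = trig_table(data)
results_tasks = trig_tasks(data)
println("running on ", nthreads(), " thread(s)")
```

For work this cheap per element, the threaded versions may not beat a plain broadcast; the pattern pays off when each iteration does more work.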

You can use the Distributed standard library to do the same thing with multiple processes, but this has a higher communication overhead because the processes do not share memory.
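A rough sketch of your Case 2 with Distributed could look like the following. It assumes 4 local worker processes and four equal chunks of 2,500 elements; sin_chunk is a hypothetical helper name. Every chunk is copied to a worker and the result is copied back, which is the communication overhead mentioned above.

```julia
using Distributed

# Assumption: 4 local worker processes; adjust to the machine.
nprocs() == 1 && addprocs(4)

# Hypothetical helper. @everywhere defines it on the main process and all workers.
@everywhere sin_chunk(chunk) = sin.(chunk)

data = rand(0:360, 10_000)
radians = data .* pi ./ 180   # degrees to radians

# Split the data into contiguous chunks of 2_500 elements each.
chunks = [radians[r] for r in Iterators.partition(eachindex(radians), 2_500)]

# pmap sends each chunk to a worker and returns the results in input order.
parts = pmap(sin_chunk, chunks)

# Column 1: radians, column 2: sin, reassembled on the main process.
results = hcat(radians, reduce(vcat, parts))
```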

For a more detailed look at multi-threading, see the Multi-Threading section of the Julia docs.


If you work with multiple threads, garbage collection might become a limiting factor, but that really depends on the problem you are trying to solve.
So run your problem single-threaded with @benchmark (from BenchmarkTools.jl) first and see if you can get rid of the allocations…

If not then running the code in multiple processes might be better…
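As a small illustration of what "getting rid of the allocations" can mean, here is a sketch that compares an allocating function with an in-place one. To avoid an extra dependency it uses Base's @allocated instead of @benchmark; with BenchmarkTools installed, something like @benchmark sin_inplace!($out, $x) should show the allocation count as well. The function names are made up for the example.

```julia
# Allocating version: builds a new output array on every call.
sin_alloc(x) = sin.(x)

# In-place version: writes into a preallocated output array.
function sin_inplace!(out, x)
    out .= sin.(x)
    return out
end

x = rand(10_000)
out = similar(x)

# Call each function once first so compilation is not counted.
sin_alloc(x)
sin_inplace!(out, x)

bytes_alloc = @allocated sin_alloc(x)
bytes_inplace = @allocated sin_inplace!(out, x)
println("allocating: ", bytes_alloc, " bytes, in-place: ", bytes_inplace, " bytes")
```

The fewer bytes a function allocates per call, the less work the garbage collector has to do when many threads call it at once.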

Thanks man. That is very useful!

See also:

“A quick introduction to data parallelism in Julia”
