How to run two tasks in parallel?

If I have two tasks and need to run them in parallel, how should I organize that in Julia? For example, the following:

using LinearAlgebra, SparseArrays
n = 1000; d = 10;
A1 = sprand(n,n,d/n); x1 = rand(n);
A2 = sprand(n,n,d/n); x2 = rand(n);
A1*x1; # task-1
A2\x2; # task-2

Threads.@spawn is the usual way to do it. A quick read of Parallel Computing · The Julia Language and Multi-Threading · The Julia Language will probably get you up to speed.

Thanks for your feedback, but I still do not know how to do that, because all the documentation describes parallelism in terms of for loops.

Here is my best (nonworking) attempt with asynchronous programming:

function f(n, d)
        A1 = sprand(n,n,d/n); x1 = rand(n);
        A2 = sprand(n,n,d/n); x2 = rand(n);

        # I'm not sure why @task puts things out of scope
        local r1, r2;

        # Create the tasks
        # Unfortunately, these don't return a value
        t1 = @task r1 = A1*x1;
        t2 = @task r2 = A2\x2;

        # Schedule and wait
        schedule.([t1, t2]);
        wait.([t1, t2]);
end

#n = 1000; d = 10;
#r1, r2 = f(n, d)

The best approach depends on your exact problem. It looks like you’re working with tasks that have widely different run times, and that will be significant as you scale things up.

I agree that most of the tools and documentation in Julia are intended for tasks that can be broken up evenly and iteratively. Figuring out how best to solve your particular problem might take a lot of creativity.

Tasks are well suited to interacting with the outside world, such as downloading a file. For computations like these, threads can be used instead:

using LinearAlgebra, SparseArrays
import Base.Threads

n = 1000; d = 10;
A1 = sprand(n,n,d/n); x1 = rand(n);
A2 = sprand(n,n,d/n); x2 = rand(n);

thr1 = Threads.@spawn A1*x1;
thr2 = Threads.@spawn A2\x2;

result1 = fetch(thr1)
result2 = fetch(thr2)

In that case, one should start Julia with multiple threads initialized by

$> julia -t n

where n is the number of threads.
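
To confirm that the flag took effect, one option is to query the thread count from inside the session. This is only a minimal check; as far as I know, `julia -t auto` and the JULIA_NUM_THREADS environment variable are alternative ways to set the count.

```julia
# Report how many threads this Julia session was started with.
# With a plain `julia` start this is typically 1, in which case
# Threads.@spawn still works but nothing runs in parallel.
println("Julia is running with ", Threads.nthreads(), " thread(s)")
```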

Thanks for your feedback.
Actually, I am not getting values for r1 and r2. Is there a reason for that?

I’m wondering the same thing. I guess using @task changes the scope, but I wasn’t able to find a workaround.

@jbytecode recommends using threads for this sort of problem, and I think I agree.
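
One possible reading of the missing values: the function f above ends with the wait call, so it returns the result of wait rather than r1 and r2. A sketch of a variant that does return them is below. It relies on fetch, which waits for a task and returns the value of its expression. Adding n*I to A2 is my own assumption, made only so that the matrix is strictly diagonally dominant and the solve cannot hit a singular matrix. Note that tasks created with @task stay, by default, on the thread that schedules them, so this interleaves the work rather than running it in parallel.

```julia
using LinearAlgebra, SparseArrays

function f(n, d)
    A1 = sprand(n, n, d/n); x1 = rand(n)
    # Assumption (not in the original post): n*I makes A2 strictly
    # diagonally dominant, so A2 \ x2 never sees a singular matrix.
    A2 = sprand(n, n, d/n) + n*I; x2 = rand(n)

    # Create the tasks; each task's value is the value of its expression.
    t1 = @task A1 * x1
    t2 = @task A2 \ x2

    # Schedule both, then fetch: fetch waits and returns the task's value.
    schedule(t1); schedule(t2)
    return fetch(t1), fetch(t2)
end

r1, r2 = f(1000, 10)
println(length(r1), " ", length(r2))
```

Replacing `@task ...` plus `schedule` with `Threads.@spawn ...` gives the threaded version while keeping the same fetch calls.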

Thank you very much!
How should I organize the flow if the two threads depend on each other? For example, if what one thread computes at each time step depends on the output of the other thread, as below.

Your feedback is really appreciated.

using LinearAlgebra, SparseArrays, Base.Threads


tmin=1;
tmax=10000;
D1 = sprand(10000,0.1);
D2 = sprand(10000,0.1);
X = zeros(10000);
Y = zeros(10000);
for n in tmin+1:tmax
   X[n] = D1[n]*D2[n] + Y[n-1];      # I want this to run on thread 1
   Y[n] = D1[n-1]*D2[n-1] + X[n-1];  # I want this to run on thread 2
end

You should probably check out Dagger.jl (Home · Dagger.jl). You can essentially spawn tasks that depend on previous tasks, and each task only starts executing once all of its dependencies have finished.

EDIT: Probably not needed here, but may be of use to OP in future.

Although, your problem is probably better suited to just spawning threads:

for n in tmin+1:tmax
   Xn = Threads.@spawn D1[n]*D2[n] + Y[n-1];
   Yn = Threads.@spawn D1[n-1]*D2[n-1] + X[n-1];
   X[n] = fetch(Xn)
   Y[n] = fetch(Yn)
end

Thank you for your valuable response!

  • Is creating the tasks inside the for loop expensive in terms of time?
  • Would it be more efficient to put the for loop inside each task and have the tasks communicate at the end of each iteration, to avoid that overhead? If so, how should that be organized?

Executing things in parallel always comes with a cost. The way to alleviate this is to make sure you only use it when the operations are much more expensive than the overhead. I would suggest benchmarking both approaches (until that point, it’s just a heuristic).

In the specific code example, it seems that there are only some floating point operations happening, which are incredibly cheap. Trying to parallelise something that small will probably lead to a 10-100x slowdown, as the overhead is so much larger.

In this case, since the next iteration of the for loop directly depends on the previous iteration, you will only be able to do at most two tasks at once, and even then, the tasks are short.

In this case, I think optimising for serial performance will be better than trying to do it in parallel. A problem that can be parallelised in theory often should not be, because the parallel version can cause large performance hits elsewhere.

To move forward, I would move both implementations into two functions and benchmark both with something like BenchmarkTools.jl.
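
A rough sketch of that comparison is below. It uses @elapsed from Base so that it has no dependencies; the @btime macro from BenchmarkTools.jl would give more reliable numbers. The function names update_serial! and update_spawn! are made up for this illustration, and the actual timings will depend on the machine and the thread count.

```julia
using SparseArrays

# Serial version of the coupled update from the thread.
function update_serial!(X, Y, D1, D2)
    for n in 2:length(X)
        X[n] = D1[n] * D2[n] + Y[n-1]
        Y[n] = D1[n-1] * D2[n-1] + X[n-1]
    end
    return X, Y
end

# Version that spawns two tasks per time step and fetches both results.
function update_spawn!(X, Y, D1, D2)
    for n in 2:length(X)
        xn = Threads.@spawn D1[n] * D2[n] + Y[n-1]
        yn = Threads.@spawn D1[n-1] * D2[n-1] + X[n-1]
        X[n] = fetch(xn)
        Y[n] = fetch(yn)
    end
    return X, Y
end

N = 10_000
D1 = sprand(N, 0.1); D2 = sprand(N, 0.1)

# Run each version once first so that compilation is not timed.
Xs, Ys = update_serial!(zeros(N), zeros(N), D1, D2)
Xp, Yp = update_spawn!(zeros(N), zeros(N), D1, D2)

t_serial = @elapsed update_serial!(zeros(N), zeros(N), D1, D2)
t_spawn  = @elapsed update_spawn!(zeros(N), zeros(N), D1, D2)
println("serial: ", t_serial, " s, spawned: ", t_spawn, " s")
```

Both versions should produce identical X and Y, which is worth checking before comparing the times.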

Thanks for your feedback.
If I want one task to execute while the other waits for it, can I do it as below, using the parameter mutex? If not, what is a suitable technique?

function thr1(tmin,tmax,mutex,sum)
   for n in tmin+1:tmax
      if mutex == false
         sum += 1;
         mutex = true;
      end
    end
end

function thr2(tmin,tmax,mutex,sum)
   for n in tmin+1:tmax
      if mutex == true
         sum += 2;
         mutex = false;
      end
    end
end

sum = 0;
mutex = false;
tmin=1;
tmax=10000;
x1 = Threads.@spawn thr1(tmin,tmax,mutex,sum)
x2 = Threads.@spawn thr2(tmin,tmax,mutex,sum)
fetch(x1)
fetch(x2)

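You want to use a SpinLock instead of a Boolean variable (Multi-Threading · The Julia Language).

myspinlock = Threads.SpinLock()
lock(myspinlock) do
    # ... calculations
end

Use this in place of the if mutex == true check. A thread that reaches the lock while another thread holds it waits until the lock is released.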
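
As a minimal sketch of the idea, assuming the goal is just to update a shared total safely from two tasks: the total lives in a Ref so that both tasks see the same storage (a plain Int or Bool argument is copied into each function, so changes to it are not visible outside). The helper name add_many! is made up for this example. Note that the lock only guarantees that one task at a time is inside the block; it does not make the two tasks take turns.

```julia
# Shared state: a Ref holds the total so both tasks update the same storage.
total = Ref(0)
lk = Threads.SpinLock()

# Add `amount` to the shared total `reps` times, holding the lock per update.
function add_many!(total, lk, amount, reps)
    for _ in 1:reps
        lock(lk) do
            total[] += amount
        end
    end
end

t1 = Threads.@spawn add_many!(total, lk, 1, 10_000)
t2 = Threads.@spawn add_many!(total, lk, 2, 10_000)
wait(t1); wait(t2)

println(total[])  # 30000
```

A ReentrantLock can be used in the same way and is the more general-purpose choice when the protected section is not tiny.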

Thank you very much, I will have a look!

I want to double-check the execution inside the for loop:
1- Spawn a task and assign it to Xn (does the execution start immediately?)
2- Spawn another task and assign it to Yn (does the execution start immediately?)
3- Wait until Xn finishes
4- Wait until Yn finishes
Please correct me if I am wrong.

Yes, the first two lines inside the loop each schedule a task immediately, and execution then moves on on the current thread. Each fetch just waits for the result, since the next iteration of the loop depends on the current one finishing.
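
A small sketch that makes this visible, using a deliberately slow task (the sleep is only there for illustration):

```julia
# Spawn returns right away with a Task handle; the work runs in the background.
t = Threads.@spawn begin
    sleep(0.5)   # stand-in for an expensive computation
    42
end

# Checked immediately after spawning, the task has very likely not finished yet.
println("done right after spawn? ", istaskdone(t))

# fetch blocks until the task finishes and returns its value.
result = fetch(t)
println("result = ", result, ", done after fetch? ", istaskdone(t))
```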

Thank you very much for your confirmation.

I tried to make the sequential code above run in parallel as below (I know there could be another method). However, it freezes, probably in one of the two while loops. Could you please correct the code so that a change to flag, X, or Y made in one thread is seen by the other thread?

function thr1(tmin,tmax,D1,D2,X,Y,flag,c)
  for n in tmin+1:tmax
    if flag[] == false
      X[n] = D1[n]*D2[n] + Y[n-1];   
      lock(c)
      flag[] = true;
      unlock(c)
    end
    while flag[] == true
    end
  end
end

function thr2(tmin,tmax,D1,D2,X,Y,flag,c)
  for n in tmin+1:tmax
    while flag[] == false
    end
    if flag[] == true
      Y[n] = D1[n-1]*D2[n-1] + X[n-1]; 
      lock(c)
      flag[] = false;
      unlock(c)
    end
  end
end

c = Base.Threads.Condition();
flag = Ref(false); 
t1 = Threads.@spawn thr1(tmin,tmax,D1,D2,X,Y,flag,c)
t2 = Threads.@spawn thr2(tmin,tmax,D1,D2,X,Y,flag,c)
fetch(t1)
fetch(t2)

In general, I wouldn’t use this approach, but instead the one here - How to run two tasks on parallel? - #10 by jmair.

I hesitate to correct the code, since using locks is probably not the way forward for this example.
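
For completeness, here is a sketch of how the flag hand-off could be made to run, offered only to illustrate why the original probably freezes. The likely culprit is that flag is a plain Ref read in a busy loop: nothing tells the compiler or the other thread that its value can change, and the loop never yields. The version below swaps in Threads.Atomic{Bool} and calls yield() while waiting. The names thr1! and thr2! and the shortened argument lists are made up for this illustration. The two updates still run strictly one after the other, so this is not expected to beat the serial loop.

```julia
using SparseArrays

# Task 1: compute X[n], hand over to task 2, then wait to be handed back.
function thr1!(X, Y, D1, D2, flag)
    for n in 2:length(X)
        X[n] = D1[n] * D2[n] + Y[n-1]
        flag[] = true            # atomic store: task 2 may proceed
        while flag[]             # atomic load: wait until task 2 is done
            yield()              # let other tasks run while waiting
        end
    end
end

# Task 2: wait for the hand-over, compute Y[n], then hand back.
function thr2!(X, Y, D1, D2, flag)
    for n in 2:length(X)
        while !flag[]
            yield()
        end
        Y[n] = D1[n-1] * D2[n-1] + X[n-1]
        flag[] = false           # atomic store: task 1 may proceed
    end
end

N = 1_000
D1 = sprand(N, 0.1); D2 = sprand(N, 0.1)
X = zeros(N); Y = zeros(N)
flag = Threads.Atomic{Bool}(false)

t1 = Threads.@spawn thr1!(X, Y, D1, D2, flag)
t2 = Threads.@spawn thr2!(X, Y, D1, D2, flag)
wait(t1); wait(t2)
println("finished; X[end] = ", X[end], ", Y[end] = ", Y[end])
```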
