I have an MWE below, in which I do some work in parallel using threads that are synchronized via channels. However, the channels seem to be slow, and I am wondering whether another synchronization primitive could give better performance?
using Base.Threads   # nthreads, threadid
using ThreadPools    # @tspawnat comes from ThreadPools.jl

function KSD()
    time_loop = true
    channels = [Channel{Nothing}(0) for i in 1:(nthreads()-1)]
    thrs = [@tspawnat i begin
        while time_loop
            take!(channels[threadid()-1])
            println("Do Some Works From Thread $(threadid())")
            take!(channels[threadid()-1])
        end
    end for i = 2:nthreads()];
    for t in 1:1000
        put!.(channels, nothing)
        println("Do Some Works From Thread $(threadid())")
        put!.(channels, nothing)
    end # for t in 1:1000
    time_loop = false
    fetch.(thrs);
    return nothing
end # function KSD()
KSD()
What makes you think that this is (too) slow and further, that channels are the culprit?
Your KSD can deadlock: if a worker enters the while loop again before time_loop becomes false, it blocks in take! on its channel forever, and accordingly fetch never returns. In general, coordinating threads via shared variables is very difficult and error-prone … that’s why we have channels.
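One way to avoid this kind of deadlock is to signal shutdown through the channel itself rather than through a shared flag. Below is a minimal sketch, not a drop-in replacement for the MWE: it uses a work-queue pattern instead of the lockstep pattern, Threads.@spawn instead of @tspawnat (so no extra package is needed), and run_workers is a hypothetical name chosen for illustration.

```julia
using Base.Threads

# Hypothetical helper: hands out `nitems` jobs to `nworkers` tasks and
# returns how many jobs were processed in total.
function run_workers(nworkers::Int, nitems::Int)
    jobs = Channel{Int}(0)           # unbuffered: put! blocks until a worker takes the item
    counts = zeros(Int, nworkers)    # one slot per worker, so no two tasks write the same element
    tasks = map(1:nworkers) do w
        # Iterating a channel ends once it is closed and empty,
        # so no shared `time_loop` flag is needed.
        Threads.@spawn for job in jobs
            counts[w] += 1           # placeholder for the real work
        end
    end
    for job in 1:nitems
        put!(jobs, job)
    end
    close(jobs)                      # workers blocked on the channel wake up and leave their loop
    foreach(wait, tasks)
    return sum(counts)
end

processed = run_workers(4, 1000)
```

Because close wakes up every task that is blocked on the channel, the workers cannot get stuck waiting for an item that never arrives.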
In any case, you can also try locks for thread coordination, which are somewhat lower level than channels.
I read in a blog post that channels can be very slow.
I don’t have a clue how to apply a lock in this MWE to synchronize the threads (I have only used locks to protect modifications of shared variables). Could you please help me here, or at least give me the idea?
Had no particular application in mind, just recalled that locks can be used to implement some higher-level synchronization constructs. Imho channels are much better and easier to use. I see no reason to avoid them. In any case, if parallel performance is the main concern any synchronization needs to be reduced to a minimum no matter which construct is used.
I’m also not quite sure what exactly you are trying to do in your MWE, i.e., why do you need control over where a thread runs, and why do you start and stop its work explicitly? Channels are often nice for pub-sub type concurrency or some simple work distribution, i.e., a producer just publishes work items and several workers take a new one whenever they are ready. When all tasks need to catch up with each other, something like a barrier might be used (which in turn can be implemented using locks or channels).
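A barrier of that kind could look roughly like the following. This is only a sketch built on Threads.Condition from Base; SimpleBarrier, barrier_wait, and lockstep_demo are hypothetical names made up for illustration, not an existing API, and Threads.@spawn is used instead of @tspawnat.

```julia
using Base.Threads

# Hypothetical reusable barrier for a fixed number of tasks.
mutable struct SimpleBarrier
    n::Int                     # number of participating tasks
    count::Int                 # tasks that have arrived in the current round
    generation::Int            # round counter, lets waiters detect that their round is over
    cond::Threads.Condition
end
SimpleBarrier(n::Integer) = SimpleBarrier(n, 0, 0, Threads.Condition())

function barrier_wait(b::SimpleBarrier)
    lock(b.cond)
    try
        gen = b.generation
        b.count += 1
        if b.count == b.n
            # Last task to arrive: reset the barrier and wake everyone.
            b.count = 0
            b.generation += 1
            notify(b.cond)
        else
            # wait releases the lock while blocked and re-acquires it afterwards.
            while gen == b.generation
                wait(b.cond)
            end
        end
    finally
        unlock(b.cond)
    end
    return nothing
end

# Every task does "work" for round r, then all tasks meet at the barrier.
function lockstep_demo(ntasks::Int, rounds::Int)
    b = SimpleBarrier(ntasks)
    progress = zeros(Int, ntasks)
    ok = Threads.Atomic{Bool}(true)
    tasks = map(1:ntasks) do i
        Threads.@spawn for r in 1:rounds
            progress[i] = r                          # placeholder for the real work
            barrier_wait(b)
            all(==(r), progress) || (ok[] = false)   # nobody is ahead or behind
            barrier_wait(b)                          # nobody starts round r+1 before all have checked
        end
    end
    foreach(wait, tasks)
    return ok[]
end

in_lockstep = lockstep_demo(4, 100)
```

The two barrier_wait calls per round play the same role as the two put!/take! pairs per iteration in the MWE. Whether this is actually faster than channels would have to be measured.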
There is a semaphore in Base. Don’t know if it is faster, but I’m also not aware that channels are particularly slow – which version of Julia was that blog talking about? Did it provide any evidence?
Anyways, as has been suggested several times in this forum when discussing parallel processing: First, seriously optimize your single threaded code. Then, when understanding its limitations/tradeoffs and memory requirements, make it multi-threaded. At all stages benchmark your changes and decisions, i.e., are channels really holding you back here? There are certainly many more people willing to help if you can post a working example of your best efforts on your actual problem (without guessing on where the bottlenecks might be).
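To get a first number for the channel overhead itself, something like the following rough measurement could be used. It is a sketch under assumptions: channel_roundtrip_seconds is a hypothetical helper name, it only times an empty put!/take! round trip between two tasks, and it uses @elapsed from Base rather than a dedicated benchmarking package, so the result is only a ballpark figure.

```julia
using Base.Threads

# Hypothetical helper: average time of one put!/take! round trip
# between the calling task and an echo task.
function channel_roundtrip_seconds(n::Int)
    ping = Channel{Nothing}(0)
    pong = Channel{Nothing}(0)
    echo_task = Threads.@spawn for _ in ping   # loop ends when `ping` is closed
        put!(pong, nothing)
    end
    elapsed = @elapsed for _ in 1:n
        put!(ping, nothing)
        take!(pong)
    end
    close(ping)
    wait(echo_task)
    return elapsed / n
end

channel_roundtrip_seconds(10)                  # warm-up run so compilation is not timed
per_roundtrip = channel_roundtrip_seconds(10_000)
println("approx. seconds per put!/take! round trip: ", per_roundtrip)
```

Comparing this number with the time one unit of the actual work takes shows whether the synchronization matters at all.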
The do notation you use with acquire is syntactic sugar, i.e.,
Base.acquire(sem) do
println("Do Some Works From Thread $(threadid())")
end
is the same as
Base.acquire(() -> println("Do Some Works From Thread $(threadid())"), sem)
and this method of acquire does the following:
1. Acquires the semaphore.
2. Calls the function given as first argument.
3. Releases the semaphore, no matter whether the function returns normally or throws an error.
(In case you know Python, the do-notation in Julia is often used in a similar fashion to the with statement and context managers in Python.)
Thus, when using the do-notation, you don’t need to call release explicitly. I.e., you can either write
Base.acquire(sem) do
println("Do Some Works From Thread $(threadid())")
end
# Note: No explicit release needed
or just use
Base.acquire(sem)
println("Do Some Works From Thread $(threadid())")
Base.release(sem) # Note: Might not be called if previous line throws an error!
In general, the do-notation is preferred as you cannot forget the release and it also releases when an error is thrown with the semaphore held.
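Spelled out by hand, the do-notation version behaves roughly like a try/finally around the call. The following is a sketch of that behaviour, not the actual Base source; with_semaphore is a hypothetical helper name.

```julia
sem = Base.Semaphore(1)

# Hypothetical helper that spells out what the do-block form of acquire does.
function with_semaphore(f, sem::Base.Semaphore)
    Base.acquire(sem)
    try
        return f()           # run the body while holding a permit
    finally
        Base.release(sem)    # always runs, even if f() throws
    end
end

result = with_semaphore(sem) do
    21 * 2
end

# The permit is given back even when the body throws ...
threw = false
try
    with_semaphore(sem) do
        error("boom")
    end
catch
    threw = true
end

# ... so acquiring again here does not block.
Base.acquire(sem)
Base.release(sem)
```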
I have not done such low-level parallel programming for quite some time, but apparently semaphores can be used to implement higher-level constructs such as barriers (see The Little Book of Semaphores for details). Overall, getting these things correct is rather difficult, and generally channels are much easier to use and less error-prone. There is also nothing wrong with channels … what makes you still believe that they are slow? Do you have any benchmarks on that?
Thank you for your feedback, and I followed what you suggested.
However, I don’t know why the results are not as expected:
using Base.Threads   # nthreads, threadid
using ThreadPools    # @tspawnat comes from ThreadPools.jl

function KSD()
    time_loop = true
    sem = Base.Semaphore(nthreads())
    thrs = [@tspawnat i begin
        while time_loop
            Base.acquire(sem) do
                println("Do Some Works From Thread $(threadid())")
            end
        end
    end for i = 2:nthreads()];
    for t in 1:1000
        Base.acquire(sem) do
            println("Do Some Works From Thread $(threadid())")
        end
    end # for t in 1:1000
    time_loop = false
    fetch.(thrs);
    return nothing
end # function KSD()
KSD()
Do Some Works From Thread 1
Do Some Works From Thread 1
Do Some Works From Thread 1
I have only used channels before, and the blog post mentioned that they can be slow. I will look it up and share it.