Hi all. First, I hope I’m posting this in the right place. I’m not much of a Discourse user, so if I’m not adhering to the community guidelines, please let me know and I’ll take this elsewhere.
Now for the meat of the matter: I recently got started with Julia for a scientific computing project, and I’m encountering something quite puzzling. I have some code that looks like this:
function foo(x)
    # do some stuff and return a positive integer
end

function bar(x, y)
    # do some stuff and return a positive integer
end

function process_data(data)
    arr = [0, length(data)]
    for x in data
        push!(arr, foo(x))
    end
end

function process_array(arr)
    new_arr = Vector{Int64}[]
    N = length(arr)
    for i in 1:N-1
        push!(new_arr, bar(arr[i], arr[i+1]))
    end
    return new_arr
end

function all_together_now(data)
    intermediate_results = process_data(data)
    println(intermediate_results)
    final_results = process_array(intermediate_results)
end
Here’s the catch: this code works as I expect it to. But the println in the final function was only meant to be there while I was building the algorithm out, to help me make sure foo() did what I thought it was doing. When I comment out the println, the code breaks: rather than the intermediate results I expect, it’s as if the loop in process_data never executed, and I’m left with just the [0] as output. This seems totally nonsensical to me as a bug, but commenting or uncommenting that single println is the only thing I’m changing. Has anyone experienced an issue like this, or, perhaps more likely, am I missing something obvious?
What’s the output of methods(all_together_now)? I’m guessing you may have multiple methods defined and may need to simply restart your session…
EDIT: I should have run your code before commenting; it looks like there are other issues. I think a more idiomatic way to accomplish something like this would be to take advantage of array comprehensions, e.g.:
function foo(x)
    return x
end

function bar(x, y)
    return x + y
end

function process_data(data)
    return [foo(x) for x in data] # array comprehension
end

function process_array(arr)
    return [bar(arr[i], arr[i+1]) for i in 1:length(arr)-1]
end

function all_together_now(data)
    intermediate_results = process_data(data)
    final_results = process_array(intermediate_results)
    # note that explicitly returning is optional
end
I tried restarting my session, but unfortunately that doesn’t seem to solve the problem. This is the output from the methods() call (using the real function name rather than the example name all_together_now). The arguments x, Q, W are all vectors, and c could in principle be a function that would affect the behavior of foo().
julia> methods(compute_change_points)
# 2 methods for generic function "compute_change_points" from Main:
 [1] compute_change_points(x, Q, W, c)
     @ ~/.julia/dev/Megafauna/src/ChangePoints.jl:88
 [2] compute_change_points(x, Q, W)
     @ ~/.julia/dev/Megafauna/src/ChangePoints.jl:88
Ah! You’re right, process_data should return something, but it does in my actual code, so that isn’t the issue unfortunately. Apologies that my example is not so good.
Yep, this is fine. (Also, welcome to the community!)
Could you post a minimal working example? I.e. some code we could run, which works fine if the println is included, and fails if not?
If I ‘fix’/complete the posted code, I don’t get a difference in behaviour whether I include the println or not (except obviously an extra print).
Code and output
function foo(x)
    # do some stuff and return a positive integer
    return round(Int, abs(x))
end

function bar(x, y)
    # do some stuff and return a positive integer
    return round(Int, abs(x + y))
end

function process_data(data)
    arr = [0, length(data)]
    for x in data
        push!(arr, foo(x))
    end
    return arr # Added return
end

function process_array(arr)
    new_arr = Int64[] # Removed the Vector: Vector{Int64}[] is an empty Vector{Vector{Int64}}
    N = length(arr)
    for i in 1:N-1
        push!(new_arr, bar(arr[i], arr[i+1]))
    end
    return new_arr
end

function all_together_now(data)
    intermediate_results = process_data(data)
    println(intermediate_results)
    final_results = process_array(intermediate_results)
end

function all_together_now_no_print(data)
    intermediate_results = process_data(data)
    final_results = process_array(intermediate_results)
end
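For anyone wondering why the `Vector` had to go, here is a quick check of what the two literals create. This is only a sketch; the helper name `try_push` is made up for the illustration and is not from the code above.

```julia
# `T[]` builds an empty array whose element type is T.
nested = Vector{Int64}[]   # element type Vector{Int64}: a vector of vectors
flat = Int64[]             # element type Int64: a plain vector of integers

# Hypothetical helper: attempt a push! and report whether it succeeded.
function try_push(arr, value)
    try
        push!(arr, value)
        return true
    catch err
        # push! converts the value to the element type first; that is what fails here
        err isa MethodError || rethrow()
        return false
    end
end

println(eltype(nested) == Vector{Int64})  # true
println(eltype(flat) == Int64)            # true
println(try_push(flat, 3))                # true
println(try_push(nested, 3))              # false: an Int is not convertible to Vector{Int64}
```

So with `Vector{Int64}[]`, every `push!` of an integer result from `bar` would throw before the loop gets anywhere.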
I have been trying to trim down all the fat this morning to post a MWE, and I think I have found my problem: I had an @distributed without an @sync. I don’t totally understand how @distributed works, but as I understand the documentation, a loop marked this way does not wait for all processes to complete before moving on unless @sync is prepended. So I guess my print statement was buying just enough time for the job to finish before moving on to the next task? Adding the @sync has resolved the issue. Should I delete this thread? Clearly my example code was neither clear nor capturing the issue I was having…
I’m not sure you can even delete a thread with multiple posts (by different people)? In any case, I would advise against this. Other people might later encounter the same issue and find certain parts of the discussion here relevant, in particular your conclusion concerning @distributed.
You could add an edit to the topic title and/or first post, to mention Distributed.jl or @sync. Also, you could mark your last post as solution, so that it automatically shows up at the bottom of the first post and anyone with similar issues can jump directly to it. (Or, if possible, you could expand upon this last post and provide a MWE with @distributed with/without @sync, println, and mark that as solution.)
Probably you just needed to yield to other asynchronous tasks in order to avoid a deadlock, since it sounds like you may be using cooperative asynchronous tasks with a single physical thread (i.e. “green” threading or “coroutines”). @sync also yields time while it waits. You could also include an explicit yield() call.
If you are using I/O functions like println, then they effectively include an implicit yield() that allows other cooperative threads to run (since I/O in Julia is asynchronous).
Last I checked, this was only the case for printing to one of the std* streams, which is handled by libuv. IIRC this is not the case when printing to a file directly, which doesn’t go through libuv but rather the custom IOStream type that’s buffering internally.
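The yielding behavior discussed above can be seen in isolation, without Distributed at all. A minimal sketch, assuming a plain `@async` task stands in for the work that `@distributed` schedules; `demo_yield` and `events` are names invented for this example.

```julia
# A task created with @async is only queued; it does not start until the
# current task gives up control, for example through yield() or blocking I/O.
function demo_yield()
    events = String[]
    t = @async push!(events, "task ran")  # schedule the task, it has not run yet
    before = copy(events)                 # snapshot taken before yielding
    yield()                               # hand control to the scheduler
    after = copy(events)                  # snapshot taken after yielding
    wait(t)                               # make sure the task has finished either way
    return before, after
end

before, after = demo_yield()
println(before)
println(after)
```

I would expect `before` to be empty and `after` to contain the single entry, which mirrors the original symptom: the array looks untouched unless something (a `println` to stdout, an explicit `yield()`, or `@sync`) lets the queued work run first.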
Thanks to everyone who replied for the helpful comments! After some trial and error, I found that my problem was my use of the @distributed macro without an @sync. Below is a MWE highlighting the behavior I was encountering.
using Distributed

function sync_foo(T)
    arr = [1,T]
    @sync @distributed for i in 2:T-1
        push!(arr, i)
    end
    return arr
end

function foo(T)
    arr = [1,T]
    @distributed for i in 2:T-1
        push!(arr, i)
    end
    return arr
end

function foo_printer(T)
    arr = foo(T)
    println(arr)
    return arr
end

function foo_noprinter(T)
    arr = foo(T)
    return arr
end

function sync_foo_printer(T)
    arr = sync_foo(T)
    println(arr)
    return arr
end

function sync_foo_noprinter(T)
    arr = sync_foo(T)
    return arr
end

T = 100000
arr = foo_printer(T)
l1 = size(arr)
arr = foo_noprinter(T)
l2 = size(arr)
arr = sync_foo_printer(T)
l3 = size(arr)
arr = sync_foo_noprinter(T)
l4 = size(arr)
println(l1,l2,l3,l4) # (100000,), (2,), (100000,), (100000,)
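One caveat on the MWE, offered as a sketch rather than a fix for the actual package: sizes like these are what I would expect when no worker processes have been added, so the loop body runs as a task in the same process. As far as I understand the manual, once workers are added with addprocs, each worker gets its own copy of `arr`, and a `push!` on a worker does not change the caller's array, with or without @sync. The reducer form of @distributed avoids both problems, since it waits for the loop and returns the combined result. The name `reduced_foo` is made up here, and the sketch assumes `T >= 3` so that the range is not empty.

```julia
using Distributed

# Hypothetical variant of `sync_foo` (name and approach are assumptions, not from the thread).
# With a reducer, @distributed blocks until every chunk is done and returns the
# reduced value, so nothing is mutated from inside the loop body.
function reduced_foo(T)
    middle = @distributed (vcat) for i in 2:T-1
        i    # the value of the last expression is passed to the reducer
    end
    return vcat([1, T], middle)
end

arr = reduced_foo(10)
println(length(arr))                 # 10
println(sort(arr) == collect(1:10))  # true
```

Repeated `vcat` is not the fastest way to build a long vector, so for large `T` something like `pmap` or preallocated shared storage may be a better fit; the point here is only that the result comes back as a return value instead of through a captured array.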