Hi,
While multi-threading some code, I came across an odd case where spawning tasks leads to Julia boxing a variable in a different code section (leading to type instabilities, `Any`, bad performance, the whole shebang).
I managed to reduce it to this contrived MWE:
function kernel!(clusters, points, irange)
    # Pseudo-cluster assignment
    for i in irange[1]:irange[2]
        clusters[i] = rand(1:length(points))
    end
end

function mtbox(points, num_tasks)
    num_points = size(points, 2)
    clusters = similar(points, Int64, num_points)
    prev_clusters = similar(points, Int64, num_points)

    # Keep track of tasks spawned
    tasks = Vector{Task}(undef, num_tasks)

    for it in 1:50
        # Swap current and previous iteration's cluster assignments
        clusters, prev_clusters = prev_clusters, clusters

        for itask in 1:num_tasks
            # Compute element indices handled by this task
            per_task = (num_points + num_tasks - 1) ÷ num_tasks
            task_istart = (itask - 1) * per_task + 1
            task_istop = min(itask * per_task, num_points)

            # Launch task over computed index range
            tasks[itask] = Threads.@spawn kernel!(
                clusters,
                points,
                (task_istart, task_istop),
            )
        end

        for task in tasks
            wait(task)
        end
    end

    clusters
end

# Example usage
mtbox(rand(3, 10), 4)
Using `Cthulhu.@descend mtbox(rand(3, 10), 4)`, we immediately see the problem (attached as a screenshot to highlight colours):
We get a `clusters::Core.Box`, which leads to `clusters, prev_clusters::Any = prev_clusters, clusters::Any`.
However, if we remove the `clusters, prev_clusters = prev_clusters, clusters` line, everything becomes type-stable again.
In isolation, neither of these leads to type instabilities:
- Swapping variable names with `clusters, prev_clusters = prev_clusters, clusters`.
- Launching tasks using `clusters` as a function argument.
But together, we get this sort of spooky action at a distance where the types cannot be inferred in one part of the code due to another.
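The same interaction can probably be reproduced without any threading. The sketch below assumes the cause is that `Threads.@spawn` wraps its expression in a closure, and that a variable which is both captured by a closure and assigned more than once gets stored in a `Core.Box`. The function names `swap_and_capture` and `swap_and_capture_let` are made up for illustration.

```julia
# Hypothetical reduction: a plain closure stands in for Threads.@spawn.
function swap_and_capture(n)
    a = zeros(Int, n)
    b = ones(Int, n)
    total = 0
    for _ in 1:3
        a, b = b, a          # `a` is assigned again here ...
        f = () -> sum(a)     # ... and captured by a closure here
        total += f()
    end
    return total
end

# Same logic, but the closure captures a fresh binding that is assigned only once.
function swap_and_capture_let(n)
    a = zeros(Int, n)
    b = ones(Int, n)
    total = 0
    for _ in 1:3
        a, b = b, a
        f = let a = a        # new local `a`, never reassigned
            () -> sum(a)
        end
        total += f()
    end
    return total
end

# Compare the inferred return types of the two variants.
println(Base.return_types(swap_and_capture, (Int,)))
println(Base.return_types(swap_and_capture_let, (Int,)))
```

If the assumption holds, the first line printed should show `Any` and the second `Int64`, even though both functions compute the same value.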
I do not have much experience with type inference - would someone know why this happens, and perhaps if there are any solutions to it?
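One workaround that may be worth trying, assuming the boxing comes from the closure that `Threads.@spawn` creates: the macro supports interpolating values with `$` (Julia 1.4 and later), which copies the current value into the task's closure instead of capturing the variable itself. The name `mtbox_interp` is hypothetical; everything else mirrors the MWE.

```julia
function kernel!(clusters, points, irange)
    # Pseudo-cluster assignment
    for i in irange[1]:irange[2]
        clusters[i] = rand(1:length(points))
    end
end

# Hypothetical variant of the MWE: only the `$clusters` interpolation differs.
function mtbox_interp(points, num_tasks)
    num_points = size(points, 2)
    clusters = similar(points, Int64, num_points)
    prev_clusters = similar(points, Int64, num_points)
    tasks = Vector{Task}(undef, num_tasks)

    for it in 1:50
        # Swap current and previous iteration's cluster assignments
        clusters, prev_clusters = prev_clusters, clusters

        for itask in 1:num_tasks
            per_task = (num_points + num_tasks - 1) ÷ num_tasks
            task_istart = (itask - 1) * per_task + 1
            task_istop = min(itask * per_task, num_points)

            # `$clusters` passes the current array by value into the task,
            # so the reassigned variable is not captured by the closure.
            tasks[itask] = Threads.@spawn kernel!(
                $clusters,
                points,
                (task_istart, task_istop),
            )
        end

        for task in tasks
            wait(task)
        end
    end

    clusters
end

result = mtbox_interp(rand(3, 10), 4)
```

Rebinding with `let clusters = clusters ... end` around the `Threads.@spawn` call should have a similar effect; whether either variant removes the `Core.Box` in the real code would need to be confirmed with `Cthulhu.@descend` or `@code_warntype`.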
Thank you,
Leonard