Here is a smaller reproducible example showing the bug:
using Statistics
using Base.Threads: @threads
X = rand(Int(1.25e6), 100)
function get_edges(X::AbstractMatrix{T}, nbins=250) where {T}
    edges = Vector{Vector{T}}(undef, size(X,2))
    @threads for i in 1:size(X, 2)
    # for i in 1:size(X, 2)
        edges[i] = quantile(view(X, :,i), (1:nbins)/nbins)
        if length(edges[i]) == 0
            edges[i] = [minimum(view(X, :,i))]
        end
    end
    return edges
end
println("num threads: ", Threads.nthreads())
println("trial 1: ")
edges = get_edges(X, 128);
println("trial 2: ")
edges = get_edges(X, 128);
println("trial 3: ")
edges = get_edges(X, 128);
println("trial 4: ")
edges = get_edges(X, 128);
The script is then run from the command line, with the number of Julia threads set to 4:
C:\Evovest\EvoTrees.jl\experiments>julia thread_bug.jl
num threads: 4
trial 1:
trial 2:
Assertion failed: new_time >= loop->time, file src/win/core.c, line 105
signal (22): SIGABRT
in expression starting at C:\Evovest\EvoTrees.jl\experiments\thread_bug.jl:23
crt_sig_handler at /cygdrive/d/buildbot/worker/package_win64/build/src\signals-win.c:92
raise at C:\WINDOWS\System32\msvcrt.dll (unknown line)
abort at C:\WINDOWS\System32\msvcrt.dll (unknown line)
assert at C:\WINDOWS\System32\msvcrt.dll (unknown line)
uv_update_time at /workspace/srcdir/libuv\src/win\core.c:105
uv_run at /workspace/srcdir/libuv\src/win\core.c:371
jl_process_events at /cygdrive/d/buildbot/worker/package_win64/build/src\jl_uv.c:214
jl_task_get_next at /cygdrive/d/buildbot/worker/package_win64/build/src\partr.c:520
poptask at .\task.jl:704
wait at .\task.jl:712 [inlined]
task_done_hook at .\task.jl:442
jl_apply at /cygdrive/d/buildbot/worker/package_win64/build/src\julia.h:1690 [inlined]
jl_finish_task at /cygdrive/d/buildbot/worker/package_win64/build/src\task.c:198
start_task at /cygdrive/d/buildbot/worker/package_win64/build/src\task.c:717
Allocations: 1298065 (Pool: 1297660; Big: 405); GC: 3
If @threads is removed from the loop, there is no crash.
With @threads, it crashes nondeterministically, after 2, 3 or more trials.
This libuv issue seems to be related: https://github.com/libuv/libuv/issues/1633, as does this associated patch: https://github.com/libuv/libuv/commit/796744869669842bd5405a71de8ba60b1556fc24.
Is Julia using a Windows system libuv, or should it be using its own? I.e., is it more likely that I should consider a Windows reinstall, or is the problem in the libuv shipped with Julia (if any)?
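To help narrow this down, one possible check is to ask the libuv library loaded into the running Julia process for its version string and compare it against the version that contains the patch linked above. This is only a minimal sketch: it assumes the `uv_version_string` symbol can be resolved with `ccall` from within the Julia process, and it reports only a version string, not whether that specific patch is included.

```julia
# Ask the libuv loaded into this Julia process for its version string.
# Assumption: `uv_version_string` is resolvable via ccall in this process.
libuv_version = unsafe_string(ccall(:uv_version_string, Cstring, ()))

println("Julia version: ", VERSION)
println("libuv version: ", libuv_version)
println("num threads:   ", Threads.nthreads())
```

Including this output alongside the stack trace should make it easier for others to tell which libuv build is involved.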