I searched for ways to run a task with a timeout, but all the answers I found used async tasks, which won’t work because the whole point is that the code doesn’t yield.
I understand that killing a thread is generally bad, but I ran into a particular case that seems justified: a large calculation in code that never yields and that you can’t reasonably modify to yield (e.g. a library). You need to abort it if it takes too long (e.g. so it doesn’t hang your REPL such that even Ctrl+C won’t stop it, which happened to me today).
I found one threaded example, and that appears to do what I need. But notice that the example sleeps, which means it yields nicely. What about something that doesn’t yield? Apparently it crashes the REPL:
Threads.nthreads() # output: 24
function run_timeout(f, seconds)
    t = Threads.@spawn begin
        try
            f()
        catch
            # println("Thread interrupted")
        end
    end
    Timer(seconds) do _ # also tried spawning this in separate thread
    # Threads.@spawn Timer(seconds) do _
        try
            if !istaskdone(t)
                println("Task did not complete in time. Aborting.")
                # Base.throwto(t, InterruptException())
                schedule(t, InterruptException(), error=true)
            else
                println("Task completed within seconds.")
            end
        catch
            # println("Error around timer")
        end
    end
end
# This works fine and does not print 'done' which is expected.
run_timeout(() -> ( sleep(10) ; println("done") ), 1.0)
spin() = reduce((x, a) -> log(x+a)%12341234, 1:100000000; init=1)
@time spin(); # output: 13.953269 seconds
# Here's the problem:
run_timeout(spin, 0.01);
#= output:
Task did not complete in time. Aborting. # displayed after 1 second
# REPL hangs for the same time spin() takes to run then it crashes with:
fatal: error thrown and no exception handler available.
20.7233
try_yieldto at .\task.jl:931
wait at .\task.jl:995
task_done_hook at .\task.jl:675
jfptr_task_done_hook_79050.1 at C:\app\dev\Julia\lib\julia\sys.dll (unknown line)
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
jl_finish_task at C:/workdir/src\task.c:320
start_task at C:/workdir/src\task.c:1249
=#
# REPL has crashed
I tried commenting out the println calls in case they were related, but they aren’t.
What’s particularly surprising is that the REPL hung even when I spawned the Timer call in a separate thread.
So, this tells me that the solution in the link above does not kill a task/thread. It only schedules an exception to be delivered to the task, and that can’t happen until the task yields, so it does nothing for unyielding tasks. Crashing is also a problem.
Everything I’m reading about this type of problem seems to be about how to do it nicely: resource release concerns and such. But I’m talking about the case where a thread is bad, misbehaving and consuming all of your CPU, and I want to kill it with fire. I can understand the argument that at that point it might be better to kill the whole process and start from scratch. However, particularly in the Julia REPL world, where it can take 5 minutes or more to get back to the same state due to compilation, data loading, etc., it would be nice not to have to do that.
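For comparison, here is a minimal sketch of the "kill a process, but not the REPL's process" variant: the calculation runs in a child Julia process, and only that child gets killed on timeout. This is an assumption about one possible workaround, not a way to kill a thread. The helper name run_code_with_timeout is hypothetical, and it takes the work as a string of Julia source because the child shares no state with the REPL.

```julia
# Hypothetical helper (not from any library): evaluate `code` in a child Julia
# process and kill that process if it has not exited after `seconds`.
function run_code_with_timeout(code::AbstractString, seconds::Real)
    # Base.julia_cmd() is the command line that started the current Julia.
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $code`
    proc = open(cmd)                     # child's stdout becomes readable via `proc`
    reader = @async read(proc, String)   # drain output so the child never blocks on a full pipe
    # Poll until the child exits or the timeout elapses; returns :ok or :timed_out.
    status = timedwait(() -> process_exited(proc), Float64(seconds); pollint=0.05)
    if status !== :ok
        # The child cannot refuse this, whether or not its code ever yields.
        kill(proc, Base.SIGKILL)
    end
    output = fetch(reader)               # read finishes once the child's stdout closes
    return (finished = status === :ok, output = output)
end

# Usage: the first call should finish, the second should be killed.
ok = run_code_with_timeout("print(1 + 1)", 60.0)
hung = run_code_with_timeout("while true end", 2.0)
println(ok.finished, " ", hung.finished)
```

The REPL session survives, which is the part I care about, but the trade-off is that the child starts cold: it has to load its own packages and data, and the timeout includes its startup time.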