I have a function that can either execute quickly or take a very long time. I want to kill the function after a certain amount of time so I can fall back on a more reliable method. I should be able to do this using a macro, but I am a bit out of my depth. I took info from here and here, and this is what I have so far:
macro timeout(time,f)
    quote
        t = Task($f)
        schedule(t)
        Timer(x -> (istaskdone(t) || Base.throwto(t,InterruptException())),$time)
        t
    end
end
The idea being I can just do
a = @timeout 5 short_or_long(args)
but my implementation is woefully wrong. Could anyone help?
I don’t believe there is a way to interrupt a Julia task that does not explicitly yield control.
https://github.com/JuliaLang/julia/issues/6283
In HTTP.jl we have a timeout task that closes the network connection after a timeout. This causes the main task to abort with an EOF error next time it tries to use the connection. https://github.com/JuliaWeb/HTTP.jl/blob/master/src/TimeoutRequest.jl#L20-L27
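If the goal is only to stop waiting and fall back to something else, rather than to actually kill the slow call, a rough sketch along the following lines may be enough. It relies on timedwait from Base; the helper name fetch_or_fallback and its arguments are made up for illustration and are not part of Base or HTTP.jl. Note the assumptions: the slow task is not interrupted and keeps running in the background, and the polling only gets a chance to run if the task yields now and then.

```julia
# Hypothetical helper (not part of Base or HTTP.jl): wait for `f()` for at most
# `seconds`, then give up waiting and return `fallback`. The task itself is NOT killed.
function fetch_or_fallback(f, seconds, fallback)
    tsk = @async f()
    # timedwait polls the predicate until it returns true or the timeout elapses
    status = timedwait(() -> istaskdone(tsk), float(seconds); pollint=0.05)
    return status === :ok ? fetch(tsk) : fallback
end

fast() = (sleep(0.1); 42)   # finishes well within the limit
slow() = (sleep(2); 42)     # exceeds the limit used below

a = fetch_or_fallback(fast, 1, :fallback)     # 42
b = fetch_or_fallback(slow, 0.3, :fallback)   # :fallback
```

This trades the ability to stop the work for not having to throw exceptions into a running task.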
Forgive my ignorance, but what does it mean to not explicitly yield control? Does this mean that a task that does explicitly yield control can be interrupted?
The way I understand it, a Julia task is its own master. If it calls yield (or sleep, or wait, or some other API that ends up calling one of those), then it explicitly hands control to the Julia runtime. At all other times, a Julia task just executes compiled machine code like a compiled C program, i.e. there is no supervisor, interpreter, or master thread, and no opportunity for preempting a Julia task.
https://github.com/JuliaLang/julia/issues/25353#issuecomment-354879008
Does this mean that a task that does explicitly yield control can be interrupted?
Yes
julia> t = @async try
           while true
               sleep(1)
               println("tick")
           end
       catch e
           println("stopped on $e")
       end
tick
tick
tick
tick
julia> @async Base.throwto(t, EOFError())
stopped on EOFError()
Task (runnable) @0x000000011555b610
julia> t
Task (done) @0x000000011555b3d0
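The same idea can be sketched without throwing an exception into the task: the worker checks a shared flag at each yield point and stops on its own. This is cooperative cancellation, a different technique from Base.throwto, and it only works if you control the worker's code. The names count_until_stopped and stop are hypothetical, chosen for illustration.

```julia
# Hypothetical worker: checks a shared flag on every iteration and returns when asked to stop.
function count_until_stopped(stop::Threads.Atomic{Bool})
    n = 0
    while !stop[]
        n += 1
        sleep(0.01)   # yield point: hands control back so the timer callback can run
    end
    return n
end

stop = Threads.Atomic{Bool}(false)
tsk = @async count_until_stopped(stop)

# Ask the worker to stop after 0.3 seconds instead of interrupting it.
Timer(_ -> (stop[] = true), 0.3)

n = fetch(tsk)   # number of iterations completed before the flag was seen
```

If the loop body never reached a yield point, the timer callback would not get to run on a single thread and the flag would never be set, which is the limitation described above.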
I’ll just leave my approach here…
macro timeout(expr, seconds=-1, cb=(tsk) -> Base.throwto(tsk, InterruptException()))
    quote
        tsk = @task $expr
        schedule(tsk)
        if $seconds > -1
            Timer((timer) -> $cb(tsk), $seconds)
        end
        return fetch(tsk)
    end
end
julia> @timeout (sleep(3); println("done")) 3.1
done
julia> @timeout (sleep(3); println("done")) 3
ERROR: TaskFailedException:
InterruptException:
Stacktrace:
[1] try_yieldto(::typeof(Base.ensure_rescheduled)) at ./task.jl:656
[2] wait at ./task.jl:713 [inlined]
[3] wait(::Base.GenericCondition{Base.Threads.SpinLock}) at ./condition.jl:106
[4] _trywait(::Timer) at ./asyncevent.jl:110
[5] wait at ./asyncevent.jl:128 [inlined]
[6] sleep at ./asyncevent.jl:213 [inlined]
[7] (::var"#29#31")() at ./task.jl:112
Stacktrace:
[1] wait at ./task.jl:267 [inlined]
[2] fetch(::Task) at ./task.jl:282
[3] top-level scope at REPL[3]:10
Thanks, this was super helpful! I found two problems with this approach though. First, the return confusingly just terminates the expression; it doesn’t allow you to assign the output of the task to a variable (compare to x = return 4). Second, when you queue up multiple tasks in a row, the Timer from the old task is still active and will end up throwing the interrupt exception to the main program (not sure why that happens). Here’s my version, which fixes these issues (and also switches the argument order):
macro timeout(seconds, expr)
    quote
        tsk = @task $expr
        schedule(tsk)
        Timer($seconds) do timer
            istaskdone(tsk) || Base.throwto(tsk, InterruptException())
        end
        fetch(tsk)
    end
end

x = @timeout 1 begin
    sleep(0.5)
    println("done")
    1
end
@assert x == 1
Thanks for this useful snippet.
I wanted something very similar but with the possibility of a default value in case of failure. So I came up with this:
macro timeout(seconds, expr, fail)
    quote
        tsk = @task $expr
        schedule(tsk)
        Timer($seconds) do timer
            istaskdone(tsk) || Base.throwto(tsk, InterruptException())
        end
        try
            fetch(tsk)
        catch _
            $fail
        end
    end
end

x = @timeout 1 begin
    sleep(1.1)
    println("done")
    1
end "failed"
this is amazing guys. thanks!
I have tried all day long to adapt @hhaensel’s snippet above so that arguments can be passed to the function. Since, as far as I know, Julia’s @task doesn’t allow a function to be called with arguments, I tried to work around that by calling the function with the variable. However, nothing happens.
My main hypothesis is that i is interpreted by the macro as a symbol and not as a value. Is there a trick to overcome this problem in non-interactive mode?
Here is my minimal (non)-working example:
function test()
    f(x) = begin 2*x ; println(2*x) ; sleep(1.) end
    #println("Testing f: f(3)")
    for i in 1:10
        @timeout 10 f(i) "fail"
        #println(i)
    end
end
I apologize for my poor English ^^’
I dug deeper into the @timeout macro by removing the try ... catch _ ... end statement and just leaving fetch(tsk) instead. Now I get the following error:
julia> test()
ERROR: TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:345 [inlined]
[2] fetch
@ ./task.jl:360 [inlined]
[3] test()
@ Main ~/Documents/GIT/GravityMachine/src/solveSPAexactly/solveSPA.jl:28
[4] top-level scope
@ REPL[7]:1
nested task error: UndefVarError: f not defined
Stacktrace:
[1] (::var"#170#173")()
@ Main ./task.jl:134
When I bring f out of test (so that f becomes global), Julia yields the same error but with i instead of f:
nested task error: UndefVarError: i not defined
Stacktrace:
[1] (::var"#187#189")()
@ Main ./task.jl:134
I feel that my problem is related to a misunderstanding of meta-programming. In my opinion I am trying to evaluate the code before parsing it (which is impossible?!).
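The UndefVarError above is what macro hygiene would produce: a symbol interpolated into the returned quote without esc is looked up as a global in the module where the macro was defined, so locals such as f and i at the call site are not visible. A minimal sketch of the difference, using two hypothetical macros (double_unescaped and double_escaped are illustration names, not from the thread):

```julia
# Without esc: `$ex` is resolved as a global in the macro's module,
# so a local variable at the call site is not found.
macro double_unescaped(ex)
    quote
        2 * $ex
    end
end

# With esc: `ex` is evaluated in the scope where the macro is called.
macro double_escaped(ex)
    quote
        2 * $(esc(ex))
    end
end

function works()
    hygiene_demo_value = 21
    @double_escaped hygiene_demo_value
end

function fails()
    hygiene_demo_value = 21
    # Refers to a global `hygiene_demo_value`, which does not exist.
    @double_unescaped hygiene_demo_value
end

ok = works()   # 42

err = try
    fails()
catch e
    e          # an UndefVarError, as in the stack traces above
end
```

Running @macroexpand on each call shows the rewritten variable names, which can help when debugging this kind of error.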
It seems what I was looking for is the following function (and not a macro):
function timeout(f, arg, seconds, fail)
    tsk = @task f(arg...)
    schedule(tsk)
    Timer(seconds) do timer
        istaskdone(tsk) || Base.throwto(tsk, InterruptException())
    end
    try
        fetch(tsk)
    catch _;
        fail
    end
end
I ran into issues with this one when using it for tasks that mostly complete within seconds but sometimes, with good reason, take minutes, which calls for a long timeout.
It would essentially flood the scheduler with many timers that were still waiting to fire, because each call leaves behind a Timer that stays alive for the full timeout even after its task has finished.
I changed it a bit to this, and it no longer seems to run into the same issues:
macro timeout(seconds, expr, err_expr=:(nothing))
    esc(quote
        tsk__ = @task $expr
        schedule(tsk__)
        start_time__ = time()
        curt__ = time()
        Base.Timer(0.001, interval=0.001) do timer__
            if tsk__ === nothing || istaskdone(tsk__)
                close(timer__)
            else
                curt__ = time()
                if curt__ - start_time__ > $seconds
                    Base.throwto(tsk__, InterruptException())
                end
            end
        end
        try
            fetch(tsk__)
        catch err__
            if err__.task.exception isa InterruptException
                RemoteHPC.log_error(RemoteHPC.StallException(err__))
                $err_expr
            else
                rethrow(err__.task.exception)
            end
        end
    end)
end
I reached the solution I was looking for by using the esc(.) function. I ended up with the following snippet:
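macro timeout(seconds, expr, fail)
    quote
        # esc(...) makes the caller's variables (e.g. f and i) visible inside the task;
        # note the parentheses: $(esc(expr)), not $esc(expr)
        tsk = @task $(esc(expr))
        schedule(tsk)
        Timer($(esc(seconds))) do timer
            istaskdone(tsk) || Base.throwto(tsk, InterruptException())
        end
        try
            fetch(tsk)
        catch _
            $(esc(fail))
        end
    end
end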