I have a function that can either execute quickly or take a very long time. I want to kill the function after a certain amount of time so I can fall back on a more reliable method. I should be able to do this using a macro. The problem is that I am a bit out of my depth. I took info from here and here, and this is what I have so far:
macro timeout(time,f)
    quote
        t = Task($f)
        schedule(t)
        Timer(x -> (istaskdone(t) || Base.throwto(t,InterruptException())),$time)
        t
    end
end
The idea being I can just do
a = @timeout 5 short_or_long(args)
but my implementation is woefully wrong. Could anyone help?
Forgive my ignorance, but what does it mean to not explicitly yield control? Does this mean that a task that does explicitly yield control can be interrupted?
The way I understand it, a Julia task is its own master. If it calls yield (or sleep, or wait, or some other API that ends up calling one of those), then it explicitly hands control to the Julia runtime. At all other times, a Julia task just executes compiled machine code like a compiled C program, i.e. there is no supervisor, interpreter, or master thread, and no opportunity for preempting a Julia task.
Does this mean that a task that does explicitly yield control can be interrupted?
Yes
julia> t = @async try while true sleep(1) ; println("tick") ; end catch e println("stopped on $e") end
tick
tick
tick
tick
julia> @async Base.throwto(t, EOFError())
stopped on EOFError()
Task (runnable) @0x000000011555b610
julia> t
Task (done) @0x000000011555b3d0
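To make the point about explicit yielding a bit more concrete, here is a minimal sketch. The function names `demo_no_yield` and `demo_yield` are made up for illustration, and the sketch assumes the tasks are started with `@async`, so they share a thread with the caller. A task that never reaches a yield point runs to completion before any other task gets a turn, while a call to `yield()` lets a queued task run in between.

```julia
# Hypothetical demo functions: tasks only switch at explicit yield points.
function demo_no_yield()
    events = String[]
    busy = @async begin
        s = 0
        for i in 1:10^6          # pure computation: no yield, sleep or wait inside
            s += i
        end
        push!(events, "busy done")
    end
    other = @async push!(events, "other ran")
    wait(busy); wait(other)
    return events
end

function demo_yield()
    events = String[]
    a = @async begin
        push!(events, "a: start")
        yield()                  # explicitly hand control back to the scheduler
        push!(events, "a: end")
    end
    b = @async push!(events, "b: ran")
    wait(a); wait(b)
    return events
end
```

In `demo_no_yield` there is no point at which `other` could run before `busy` has finished; in `demo_yield` the `yield()` call gives task `b` its chance while `a` is still in progress. The same reasoning explains why an exception thrown at a task can only arrive while that task is sitting at such a yield point.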
Thanks, this was super helpful! I found two problems with this approach though. First, the return confusingly just terminates the expression; it doesn’t allow you to assign the output of the task to a variable (compare to x = return 4). Second, when you queue up multiple tasks in a row, the Timer from the old task is still active and will end up throwing the interrupt exception to the main program (not sure why that happens). Here’s my version, which fixes these issues (and also switches the argument order):
macro timeout(seconds, expr)
    quote
        tsk = @task $expr
        schedule(tsk)
        Timer($seconds) do timer
            istaskdone(tsk) || Base.throwto(tsk, InterruptException())
        end
        fetch(tsk)
    end
end
x = @timeout 1 begin
    sleep(0.5)
    println("done")
    1
end
@assert x == 1
I wanted something very similar but with the possibility of a default value in case of failure. So I came up with this:
macro timeout(seconds, expr, fail)
    quote
        tsk = @task $expr
        schedule(tsk)
        Timer($seconds) do timer
            istaskdone(tsk) || Base.throwto(tsk, InterruptException())
        end
        try
            fetch(tsk)
        catch _
            $fail
        end
    end
end
x = @timeout 1 begin
    sleep(1.1)
    println("done")
    1
end "failed"
I have spent all day trying to adapt @hhaensel’s snippet above so that the function is called with arguments. As far as I know, Julia’s @task doesn’t allow a function to be called with arguments, so I tried to work around this by calling the function with the variable directly. However, nothing happens.
My main hypothesis is that i is interpreted by the macro as a symbol and not as a value. Is there a trick to overcome this problem in non-interactive mode?
Here is my minimal (non)-working example:
function test()
    f(x) = begin 2*x ; println(2*x) ; sleep(1.) end
    #println("Testing f: f(3)")
    for i in 1:10
        @timeout 10 f(i) "fail"
        #println(i)
    end
end
I dug deeper into the @timeout macro by removing the try ... catch _ ... end statement and leaving just fetch(tsk) instead. Now I get the following error:
julia> test()
ERROR: TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:345 [inlined]
[2] fetch
@ ./task.jl:360 [inlined]
[3] test()
@ Main ~/Documents/GIT/GravityMachine/src/solveSPAexactly/solveSPA.jl:28
[4] top-level scope
@ REPL[7]:1
nested task error: UndefVarError: f not defined
Stacktrace:
[1] (::var"#170#173")()
@ Main ./task.jl:134
When I bring f out of test (so that f becomes global), Julia yields the same error but with i instead of f:
nested task error: UndefVarError: i not defined
Stacktrace:
[1] (::var"#187#189")()
@ Main ./task.jl:134
I feel that my problem is related to a misunderstanding of meta-programming. In my opinion I am trying to evaluate the code before parsing it (which is impossible?!).
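For what it’s worth, the `UndefVarError` looks consistent with macro hygiene rather than with evaluation order: inside the `quote` returned by a macro, names that are not wrapped in `esc` are looked up as globals in the module where the macro was defined, so the local `f` and `i` are not visible. A minimal sketch of the difference, with the timer left out and with hypothetical macro names `@run_unescaped` and `@run_escaped`:

```julia
# Hypothetical macros, stripped down to the task part (no timer).
macro run_unescaped(expr)
    quote
        tsk = Task(() -> $expr)          # names in expr are looked up as globals
        schedule(tsk)
        fetch(tsk)
    end
end

macro run_escaped(expr)
    quote
        tsk = Task(() -> $(esc(expr)))   # expr is evaluated in the caller's scope
        schedule(tsk)
        fetch(tsk)
    end
end

function demo_escaped()
    double_it(x) = 2x                    # local function, like f in test()
    return [@run_escaped(double_it(k)) for k in 1:3]
end

function demo_unescaped()
    double_it(x) = 2x
    try
        return [@run_unescaped(double_it(k)) for k in 1:3]
    catch err
        return err                       # the failure surfaces through fetch
    end
end
```

With `esc`, the local `double_it` and `k` are captured by the closure as usual. Without it, the task looks for globals named `double_it` and `k`, which matches the errors reported for `f` and `i`.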
It seems what I was looking for is the following function (and not a macro):
function timeout(f, arg, seconds, fail)
    tsk = @task f(arg...)
    schedule(tsk)
    Timer(seconds) do timer
        istaskdone(tsk) || Base.throwto(tsk, InterruptException())
    end
    try
        fetch(tsk)
    catch _;
        fail
    end
end
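Since this is an ordinary function, local variables can also be passed by wrapping the call in a closure, so no argument tuple is needed. Below is a sketch of that variant. The name `run_with_timeout` is hypothetical, it relies on the same `Base.throwto` mechanism as the function above, and closing the timer in a `finally` block is my own addition, not something from the earlier posts.

```julia
# Hypothetical closure-taking variant of the function above.
function run_with_timeout(f, seconds, fail)
    tsk = Task(f)
    schedule(tsk)
    timer = Timer(seconds) do _
        # Same mechanism as above: interrupt the task if it is still running.
        istaskdone(tsk) || Base.throwto(tsk, InterruptException())
    end
    try
        fetch(tsk)
    catch
        fail
    finally
        close(timer)   # assumption: stop the timer once fetch has returned or thrown
    end
end

function test_closures()
    f(x) = 2x
    # The anonymous function captures the local `f` and `i` directly.
    return [run_with_timeout(() -> f(i), 1, "fail") for i in 1:3]
end
```

Calling `test_closures()` exercises only the fast path, where each call finishes well before its timer fires.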
I’ve run into issues with this one when using it for tasks that mostly complete within seconds but sometimes, with good reason, take minutes, which calls for a long timeout.
It would essentially flood the scheduler with many timers that were waiting to finish.
I changed it a bit to this, and it seems to not run into the same issues anymore:
macro timeout(seconds, expr, err_expr=:(nothing))
    esc(quote
        tsk__ = @task $expr
        schedule(tsk__)
        start_time__ = time()
        curt__ = time()
        Base.Timer(0.001, interval=0.001) do timer__
            if tsk__ === nothing || istaskdone(tsk__)
                close(timer__)
            else
                curt__ = time()
                if curt__ - start_time__ > $seconds
                    Base.throwto(tsk__, InterruptException())
                end
            end
        end
        try
            fetch(tsk__)
        catch err__
            if err__.task.exception isa InterruptException
                RemoteHPC.log_error(RemoteHPC.StallException(err__))
                $err_expr
            else
                rethrow(err__.task.exception)
            end
        end
    end)
end
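The polling idea in the macro above can also be sketched as a plain function using Base’s `timedwait`, which checks a predicate at a fixed interval and cleans up its own timer. This swaps in a different technique: unlike the macro, the sketch does not interrupt the task, it only stops waiting for it, so a runaway computation keeps running in the background. The name `wait_with_timeout` and its `pollint` default are assumptions made for illustration.

```julia
# Hypothetical helper: wait up to `seconds` for `f()`, otherwise return `fallback`.
# Unlike the macro above, this does NOT interrupt the task; it only stops waiting.
function wait_with_timeout(f, seconds, fallback; pollint=0.01)
    tsk = @async f()
    # timedwait polls the predicate every `pollint` seconds and
    # returns :ok once it is true, or :timed_out after `seconds`.
    status = timedwait(() -> istaskdone(tsk), float(seconds); pollint=pollint)
    return status === :ok ? fetch(tsk) : fallback
end

fast = wait_with_timeout(() -> 42, 1, :late)               # finishes in time
slow = wait_with_timeout(() -> (sleep(1); 1), 0.1, :late)  # gives up after 0.1 s
```

If the task itself fails, `fetch` rethrows the failure, so real errors are not silently replaced by the fallback value.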