Is there a way to break a function after a time limit?
I don’t have any access to the function.
FYI (if it helps) I want to break a JuMP solve after some seconds without knowing which solver performs the task.
(the solver is chosen by the user)
Ideally I would have a try/catch block in which I can somehow specify the time limit.
Thanks. If I build the model using JuMP and the solver doesn’t implement internalmodel, I have no access to the MathProgBase level, right? It seems that GLPK has the setparameters! function but not internalmodel.
I am stuck on the same problem without a clear general solution.
Basically, given a function f(x), how can I wrap it in a function g(x) that waits, say, 1 min and, if f(x) is still running, stops the execution and returns missing, for example?
I still don’t get it. How does the code you shared solve the issue? If the function is not called asynchronously, it blocks execution, so we can’t be watching it in a while loop, right?
Can you please show an example where you ask for an execution of say 10min duration (without knowing it will take 10min), but then you kill the execution in 10s?
Getting timeout (or cancellation in general) right is the main theme in Structured Concurrency https://github.com/JuliaLang/julia/issues/33248. But we don’t have an out-of-the-box solution to this right now. You have to pass around some kind of “cancellation token” for this to work. But it’d mean it’s more or less equivalent to Tamas_Papp’s code:
function f(x, should_stop)
    while true
        sleep(0)  # need some I/O
        should_stop[] && return missing
        x += 1
    end
    x
end

@sync begin
    should_stop = Threads.Atomic{Bool}(false)  # token
    t = @async f(0, should_stop)
    sleep(0.5)
    should_stop[] = true
    fetch(t)
end
I could be wrong but I think what @juliohm is asking for is to be able to stop a black box function that won’t necessarily ever yield. So to be able to take any function f, whose internals we don’t have access to and wrap it in a function runtime_limiter like so:
result = runtime_limiter(f, timeout=60)
So in that example, if f is still running after 60 time units, then runtime_limiter returns missing.
In this case could you do something like
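# time_ns() returns a UInt64, so keep the deadline in the same type
time_from_now(seconds) = time_ns() + round(UInt64, 1e9 * seconds)

# keyword arguments: named ones first, the kwargs... splat must come last
function runtime_limiter(f::Function, args...; timeout=60, kwargs...)
    t = @async f(args...; kwargs...)
    end_time = time_from_now(timeout)
    result = missing
    while time_ns() <= end_time
        sleep(0.1)  # yield so that the task gets a chance to run
        if istaskdone(t)  # isready is not defined for a Task
            result = fetch(t)
            break
        end
    end
    return result
end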
Note that I haven’t tested the above. Don’t know if it would actually work.
Edit: In the case that the function did not complete before timeout, I don’t know how you would kill the running task.
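For what it’s worth, the polling loop above can probably be expressed with Base’s timedwait instead. The following is only a minimal sketch: wait_at_most is a made-up name and the pollint keyword is my own addition. It has the same limitation as discussed here: on timeout the task is abandoned, not killed, and if f never yields (and everything runs on one thread) the polling itself never gets a chance to run.

```julia
# Hypothetical helper (not from any package): wait for `f` at most `timeout` seconds.
function wait_at_most(f, args...; timeout=60.0, pollint=0.1, kwargs...)
    t = @async f(args...; kwargs...)
    # `timedwait` polls the callback and returns :ok or :timed_out.
    status = timedwait(() -> istaskdone(t), Float64(timeout); pollint=Float64(pollint))
    # On timeout the task is NOT killed; it keeps running in the background.
    return status === :ok ? fetch(t) : missing
end

wait_at_most(sum, 1:10; timeout=1.0)   # finishes in time, returns the sum
wait_at_most(sleep, 5; timeout=0.5)    # gives up and returns missing
```

So this only answers the “stop waiting” half of the question, not the “stop the computation” half.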
Ok. I think I misunderstood your reply @tkf. So, to check that I understand you correctly, you’re saying that as far as you know there is no way (or at least no straightforward way) to kill a running task from within the same julia process as the task?
I hadn’t thought of the ability to kill a task as being related to structured concurrency. But upon reflection I see that it is. Again, to check my understanding, the problem is that in unstructured concurrency, all the tasks live in some kind of global scope with no way to trace how they relate to each other. So if you create a task that creates other secondary tasks, then when you want to kill the primary task, there is no automatic way to distinguish between secondary tasks that need to be cleaned up and other tasks that are completely unrelated.
Of course if you’re writing everything yourself you could have a channel that holds all the tasks for a given computation and pushes into it when new subtasks are created but that doesn’t work as soon as you have a black box function that creates its own new tasks.
Thanks for the reply and for helping to improve my understanding @tkf
@juliohm, if you’re willing to fire up a new julia process to run your long running function (and move the input data over to that new process) then there’s a nice solution over at StackOverflow:
The idea is to start your computation asynchronously on a remote worker and create an empty RemoteChannel for the result. Then back on your master process you have a loop that calls isready on the RemoteChannel until the timeout is reached. If the timeout is reached and the RemoteChannel is still empty you just use rmprocs to kill the remote worker.
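A rough sketch of that idea, as I understand it, could look like the code below. This is not the code from the StackOverflow answer: run_on_worker_with_timeout and its keywords are names I made up, and it assumes f and its arguments can be serialized to the worker (for your own named functions you would need @everywhere first).

```julia
using Distributed

# Hypothetical helper: run f(args...) on a fresh worker, give up after `timeout` seconds.
function run_on_worker_with_timeout(f, args...; timeout=60.0, pollint=0.1)
    pid = first(addprocs(1))                       # new process just for this call
    result = RemoteChannel(() -> Channel{Any}(1))  # empty channel, lives on the master
    # Start the computation on the worker without waiting for it.
    remote_do(pid, result, args) do ch, a
        put!(ch, f(a...))
    end
    # Poll the channel from the master until a result arrives or the time is up.
    status = timedwait(() -> isready(result), Float64(timeout); pollint=Float64(pollint))
    value = status === :ok ? take!(result) : missing
    rmprocs(pid)                                   # remove the worker in either case
    return value
end

run_on_worker_with_timeout(+, 1, 2; timeout=60.0)    # result comes back from the worker
run_on_worker_with_timeout(sleep, 30; timeout=2.0)   # times out, worker is removed
```

Note that if f throws on the worker, nothing is ever put into the channel, so this sketch just reports missing after the timeout. Starting a worker per call is also not cheap, and the worker has to compile everything it runs for itself.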
It’s a requirement for implementing structured concurrency.
I think you are describing the so-called black box principle (i.e., after a function returns or throws, none of the tasks it spawned should still be running in the background). Task cancellation is another building block of structured concurrency, as you’d need a way to terminate other tasks in order to enforce the black box principle and to terminate a function call rapidly.
Yeah, I agree that putting it in a process is a nice way to robustly cancel the computation (if you are OK with the overhead of a remote call).
I have the same problem. Given a call whose return value I am not interested in (only its terminal output matters to me), I want to make sure it runs unimpeded for X seconds (i.e., without considerable overhead), and then, if it has not finished by the end of this time, it is “stopped by force” and the program flow continues normally by executing the next line after this call. My only other constraint is that it does not recompile methods that were already called before (I am not sure whether solutions that spawn a new process guarantee that or not; I am very ignorant of parallel/distributed computing, especially with Julia).
I looked at JuliaObserver but failed to find anything that seems to solve my problem. Then I looked at GitHub and found three repositories: one of them is ancient, and the other two are both called Timeout.jl, one last updated in 2018 and written by @ararslan and another last updated five months ago and written by @goropikari.
This most recent package seems to just create one task with a sleep(time_limit) and another task with the method you want to call; if the first task returns first it kills the second, and if the second returns first it kills the timer. Does this work for my purposes? Will the inner calls make use of previously compiled methods (i.e., methods already called with the same parameter types), or does this run in an entirely separate process and recompile everything it needs there?
That is exactly what I need actually @Pbellive, the original idea was to use distributed processes as opposed to tasks or threads given that my function is quite expensive.
However, I would like to ask how we could adapt that SO answer to the following scenario. I have a pmap call:
results = pmap(xs, on_error=e->missing) do x
f(x)
end
It already takes care of the problematic iterations that throw errors. Now I want to add the time limit feature we’ve been discussing. Is it possible to do it together with pmap, or do I need to reimplement the parallel map functionality by hand by calling remotecall on the pool?
I will give it a try, but if anyone has experience in this topic, please feel free to go ahead and share some code snippet with the pmap + time limit functionality. I need this to speed up an experiment for a paper.
results = pmap(xs, on_error=e->missing) do x
    put!(chan, (:started, myid()))  # RemoteChannel uses put!, not push!
    try
        f(x)
    finally
        put!(chan, (:finished, myid()))
    end
end
where chan is a RemoteChannel. You can then do something like
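function killloop(timeout, chan)
    timers = Dict{Int,Timer}()  # keyed by worker id, as returned by myid()
    # If your Julia version cannot iterate a RemoteChannel, loop on take!(chan) instead.
    for (event, id) in chan
        if event === :started
            # kill the worker if it has not reported :finished within `timeout` seconds
            timers[id] = Timer(timeout) do _
                try
                    rmprocs(id)
                finally
                    pop!(timers, id, nothing)
                end
            end
        elseif event === :finished
            t = pop!(timers, id, nothing)
            if t !== nothing
                close(t)  # cancel the pending kill
            end
        end
    end
end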
to stop the process (run this in @async). Though I’m not sure how this interacts with pmap.
I haven’t updated it because I didn’t know people actually used it. Tbh I haven’t used it more than once or twice for anything substantial. It should be compatible with at least 0.6 and 1.0, but doesn’t have appropriate compatibility bounds set in Project.toml.
It takes the approach of running the function call in a remote process and forcibly killing the process if the process doesn’t yield after interrupts. That’s sort of like using a sledge hammer as a fly swatter, but I couldn’t think of anything better at the time, since an interrupted Task may not yield.
PRs of any kind for that package are of course welcome!