How to make sure a task interrupts when it takes too long

arnerob · June 20, 2024, 4:19pm

So I’m trying to do some experiments on how long some algorithms take.
Whenever an experiment takes longer than some set time-out I want to stop it and go to another one.
The algorithms I’m trying to time are pretty involved and use IterTools, JuMP and CPLEX.
I tried to make a minimum working example, but the code below works fine for me.

using BenchmarkTools

function runBenchmark(bench, timeout=10)
    task = @async run(bench)
    elapsed_time = 0.0
    check_interval = 0.1  # Check every 0.1 seconds

    while elapsed_time < timeout
        sleep(check_interval)
        if istaskdone(task)
            return fetch(task)
        end
        elapsed_time += check_interval
    end

    @async Base.throwto(task, InterruptException())
    return nothing
end

for k in 15:-5:5
    bench = @benchmarkable sleep($k)
    println(runBenchmark(bench))
end

The problem is that when I replace bench with the complicated algorithm, it doesn’t get interrupted for some reason. It returns the output with a timed value above the time-out. I’m trying to understand how this happens and how to fix it.

The algorithm I’m testing sometimes takes a long time because it iterates over a large Cartesian product (using IterTools.product) and solves a linear program for each element.

I guess I could write a small bash script with “timeout” that runs julia scripts, but maybe understanding the problem would help me find a cleaner solution?

Satvik · June 20, 2024, 4:42pm

If you want to interrupt a task, it needs yield() statements somewhere. sleep already yields to the scheduler internally, so it’s interruptible.

For example, this can’t be interrupted:

function foo()
    i = 0
    while true
        i += 1
    end
end

@async foo()

But this can:

function foo()
    i = 0
    while true
        i += 1
        yield()
    end
end

@async foo()
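
A quick way to check this is to start the yielding loop, let it run briefly, and then deliver an InterruptException to it. This is only a sketch: foo_yielding and the counter are hypothetical names added for illustration, and it uses schedule(task, exc; error=true) rather than the Base.throwto call from the original post. Injecting exceptions into a running task is generally fragile, so treat this as a demonstration rather than a robust pattern.

```julia
# Same loop as above, but counting into a Ref so progress can be inspected.
function foo_yielding(counter::Ref{Int})
    while true
        counter[] += 1
        yield()          # yield point: the scheduled exception is raised here
    end
end

counter = Ref(0)
task = @async foo_yielding(counter)

sleep(0.2)               # let the loop run for a moment

# Ask the scheduler to raise an InterruptException inside the task
# the next time it resumes from a yield point.
schedule(task, InterruptException(); error=true)

# Poll until the task has actually stopped (bounded, in case it does not).
stop_deadline = time() + 5
while !istaskdone(task) && time() < stop_deadline
    sleep(0.01)
end

println("iterations: ", counter[])
println("task failed after interrupt: ", istaskfailed(task))
```

Without the yield() call the main task would never get a chance to run again after the sleep, so the exception would never be scheduled, let alone delivered.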

Per the documentation, yield will “Switch to the scheduler to allow another scheduled task to run. A task that calls this function is still runnable, and will be restarted immediately if there are no other runnable tasks.”
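
For a loop like the one in the question (a Cartesian product with one solve per element), one option that avoids interrupting the task from outside is to let the loop check a deadline itself. The following is a minimal sketch under several assumptions: solve_one is a hypothetical placeholder for the per-element linear program, run_with_deadline is a made-up helper name, and Base's Iterators.product stands in for IterTools.product.

```julia
# Hypothetical stand-in for the real work on one element of the product.
# In the actual code this would build and solve a linear program.
solve_one(combo) = sum(combo)

# Visit every combination in the product of `ranges`, stopping early once
# `timeout` seconds have passed. Returns (results, finished).
function run_with_deadline(ranges, timeout)
    deadline = time() + timeout
    results = Int[]
    for combo in Iterators.product(ranges...)
        if time() > deadline
            return results, false    # timed out between two iterations
        end
        push!(results, solve_one(combo))
        yield()                      # give other tasks a chance to run
    end
    return results, true
end

results, finished = run_with_deadline((1:3, 1:3), 10.0)
println(length(results), " combinations, finished = ", finished)
```

The check only happens between iterations, so a single long solve still runs to completion. If the individual solves are the slow part, a solver-side limit such as JuMP's set_time_limit_sec(model, seconds) may be needed in addition.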