Loop slows things down drastically

OK, this’ll be a bit vague, as I’m working on some unpublished code, but maybe someone can still spot what’s going on…

I have a testset that looks something like this:

@testset begin
    x = Something()
    # various functions with x as parameter
    # various tests
end

(The code involves building a mixed-integer program based in part on x, and then solving it using Cbc.)

Now I’d like to use the same test with more types for x. I started with a minor refactor:

@testset begin
    for x in [Something()]
        # various functions with x as parameter
        # various tests
    end
end

I thought this ought to be about as fast (until I added other values – especially of other types), but for some reason it went from about 0.12 s to 0.27 s.

I also tried replacing the testset’s begin/end block with an equivalent loop form, but that ended up at about 0.30 s.

As far as I can see, there’s no random test data being generated that could, for example, make the model harder for Cbc to solve.

This is quite slow (surprisingly so, even) to begin with, and so I’d rather avoid doubling it :-} Any thoughts – if nothing else, then on how to debug/figure out what’s going on?

(Using 1.5.0-rc1.0 on macOS Catalina.)


I suspect more info on the Something part will be needed to be more helpful. But have you profiled the code?
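In case it helps, here’s a minimal profiling sketch using the stdlib Profile module; run_tests_once is just a made-up stand-in for whatever your test body does:

```julia
using Profile

# Hypothetical stand-in for the real test body.
function run_tests_once()
    s = 0.0
    for i in 1:10^6
        s += sqrt(i)
    end
    return s
end

run_tests_once()            # warm up first, so compilation isn't what gets profiled
Profile.clear()
@profile run_tests_once()   # collect samples while the code runs
Profile.print(maxdepth=12)  # show where the time is spent
```

The warm-up call matters: without it, most of the samples land in the compiler rather than in your code.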

And, just to make sure, you are using BenchmarkTools to time it?

And finally, you say you are on 1.5.0-rc1.0 (I wonder why). Do you get similar results using the “final” 1.5.x releases?


Could you post what Something() is? Or at least check @code_warntype for type stability and @btime for allocations?
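For reference, a sketch of that kind of check using only the stdlib (Wrapper and describe are made-up stand-ins for Something and the functions that take it; @allocated stands in for @btime’s allocation report, since it needs no extra package):

```julia
using InteractiveUtils  # provides @code_warntype

# Made-up stand-in for Something(): a thin wrapper around some data.
struct Wrapper{T}
    data::T
end

describe(w::Wrapper) = length(w.data)

w = Wrapper(collect(1:10))
@code_warntype describe(w)       # look for Any/Union annotations in the output

describe(w)                      # warm up before measuring
allocs = @allocated describe(w)  # bytes allocated by one call
println("allocated: $allocs bytes")
```

If @code_warntype shows concrete types throughout and the allocation count stays flat as you scale the input, the wrapper itself is probably not the problem.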

Interestingly, the same thing happens if I just introduce an irrelevant loop:

@testset begin
    x = Something()
    for i = 1:1
        # various functions with x as parameter
        # various tests
    end
end

Or:

@testset begin
    for i = 1:1
        x = Something()
        # various functions with x as parameter
        # various tests
    end
end

Sooo … I guess I’ll have to strip things down a bit and produce a minimal example after all, to figure out what’s going on, then.

@hendri54: I haven’t done any deep-dive investigation yet – no profiling. I guess I’ll do that. Yes, the effect occurs when using BenchmarkTools. (No real reason for using the RC; I just haven’t updated, and included it as metadata. Tried with 1.5.3, and the results are the same.)

@affans: It’s essentially just a wrapper around a LightGraphs graph. But it seems that’s not where the issue is, anyway (given the above). Not sure which part to apply @code_warntype to, without a bit of rewriting (which I could certainly try to do). (Nothing detected if I apply it directly to the @testset, but that may not be all that informative.)

But you’re absolutely onto something with looking at the allocations from @btime. Without a loop, the allocations are 5.91 MiB; with each of the loops in this post, it’s 18.21 MiB, for some reason.

Hmm. I think it may at least in part be due to differences in compile time…? I thought that wouldn’t be the case when using @btime – as it runs the thing multiple times, and it would already be compiled when it’s run again – but I realize that I’ve been applying @btime to the running of the test script itself (i.e., include in the REPL), and I guess there’s no avoiding running the compiler again, then. And I suppose that for some reason the compiler has to work a lot harder with that loop in the picture?

Simple test of this (with much lower times, of course, since I’m dropping everything JuMP-related):

using Test
using BenchmarkTools

@btime @testset begin
    @testset begin
        for i = 1:1
            println("Hello")
        end
    end
end

With loop, @btime here:
332.680 μs (135 allocations: 45.56 KiB)
Without loop, @btime here:
334.635 μs (135 allocations: 45.56 KiB)
With loop, @btime around include:
31.266 ms (36198 allocations: 2.03 MiB)
Without loop, @btime around include:
3.674 ms (3934 allocations: 289.77 KiB)

My workflow has so far been using Revise (not for the timing tests, but otherwise), and then using include with tests/runtests.jl – since the standard test framework is designed to be run like a script (or included in the REPL). However, I guess that’s not such a great idea after all, then … I now wrapped my tests in a function, and applied @btime to that.

These results are with the original code.

With the loop:
89.948 ms (60092 allocations: 4.86 MiB)
Without the loop:
85.395 ms (59696 allocations: 4.82 MiB)

Not only are the runs much closer – they are also obviously a lot faster. (That won’t help on the first run, but with Revise and repeatedly running the tests to check my code while refactoring, it’ll still be an improvement.) So, yeah. I guess the moral of the story (for me, anyway) is not to have my @testsets at the top level. (I could also minimize the code inside them, putting it in functions – but the code here was just two function calls to begin with!)

I always wrap testsets in functions, just to avoid side effects (and so I can easily run a single testset from the REPL). Then @btime test1() is also feasible.
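A minimal sketch of that pattern (the function name and tests are illustrative):

```julia
using Test

# Wrapping the testset in a function avoids top-level global state
# and makes it easy to rerun (or benchmark) just this set from the REPL.
function test_arithmetic()
    @testset "arithmetic" begin
        @test 1 + 1 == 2
        @test sqrt(4.0) == 2.0
    end
end

test_arithmetic()  # runs the tests and returns the testset object
```

With this in place, something like `@btime test_arithmetic()` measures only the test execution, not the recompilation that an `include` of the whole script would trigger.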