Hmm. I think it may at least in part be due to differences in compile time…? I thought that wouldn’t be the case when using @btime – since it runs the thing multiple times, it would already be compiled on the later runs – but I realize that I’ve been applying @btime to the running of the test script itself (i.e., include in the REPL), and I guess there’s no avoiding running the compiler again, then. And I suppose that for some reason the compiler has to work a lot harder with that loop in the picture?
Simple test of this (with much lower times, of course, since I’m dropping everything JuMP-related):
using Test
using BenchmarkTools
@btime @testset begin
    @testset begin
        for i = 1:1
            println("Hello")
        end
    end
end
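(The “without loop” variant is just the same thing with the for stripped out – a minimal sketch of what I compared against:)
using Test
using BenchmarkTools

# Same nested testsets, but calling println directly
# instead of wrapping it in a single-iteration loop.
@btime @testset begin
    @testset begin
        println("Hello")
    end
end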
With loop, @btime here:
332.680 μs (135 allocations: 45.56 KiB)
Without loop, @btime here:
334.635 μs (135 allocations: 45.56 KiB)
With loop, @btime around include:
31.266 ms (36198 allocations: 2.03 MiB)
Without loop, @btime around include:
3.674 ms (3934 allocations: 289.77 KiB)
My workflow so far has been to use Revise (not for the timing tests, but otherwise), and then include tests/runtests.jl – since the standard test framework is designed to be run like a script (or included in the REPL). However, I guess that’s not such a great idea after all, then … I now wrapped my tests in a function and applied @btime to that, roughly as sketched below.
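(A minimal sketch of what I mean – run_tests and the toy testset body are stand-ins; my actual tests go where the inner testsets are:)
using Test
using BenchmarkTools

# Hypothetical wrapper: the testsets now live inside a function body
# instead of at the top level of runtests.jl.
function run_tests()
    @testset "all tests" begin
        @testset "inner" begin
            for i = 1:1
                println("Hello")
            end
        end
    end
end

run_tests()          # the first call still pays the compilation cost
@btime run_tests()   # later runs measure the already-compiled version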
These results are with the original code.
With the loop:
89.948 ms (60092 allocations: 4.86 MiB)
Without the loop:
85.395 ms (59696 allocations: 4.82 MiB)
Not only are the runs much closer – they are also obviously a lot faster. (It won’t help on the first run, but with Revise and repeatedly running the tests to check my code while refactoring, it’d still be an improvement.) So, yeah. I guess the moral of the story (for me, anyway) is not to have my @testsets at the top level. (I could also minimize the code inside them, putting it in functions – but the code here was just two function calls to begin with!)