SVectors + @reset from Accessors: Strange Benchmarks

If we get rid of the random number, can’t the compiler just calculate the result? I think that is what I am seeing when I make it simpler.

Results differ between my PC and my laptop (which also run different Julia versions) and depend on the optimization flag.

Slimmed MWE
using Parameters, StaticArrays, Accessors, BenchmarkTools
using Distributions, Random 

@with_kw struct Str00{N}
    fat   :: SVector{N, Float64}
    sh_c0 :: SVector{N, Float64}
    sh_cm :: SVector{N, Float64}
end


Random.seed!(1234)
bg1 = Str00(fat   = SVector{16}(fill(Float64(1.0), 16)),
            sh_c0 = SVector{16}(rand(Uniform(4, 25), 16)),
            sh_cm = SVector{16}(rand(Uniform(4, 25), 16)))


bg2 = Str00(fat   = SVector{16}(fill(Float64(1.0), 16)),
            sh_c0 = SVector{16}(rand(Uniform(4, 25), 16)),
            sh_cm = SVector{16}(rand(Uniform(4, 25), 16)))


gt = (bg1, bg2)


function test1(gt)
    for t in 1:100
        @reset gt[1].sh_cm .= gt[1].sh_c0 .* gt[1].fat
        @reset gt[2].sh_cm .= gt[2].sh_c0 .* gt[2].fat
    end
    return gt
end

function test2(gt)
    for t in 1:100
        gt = map(gt) do x
            @reset x.sh_cm .= x.sh_c0 .* x.fat
            return x
        end
    end
    return gt
end


@btime  test1($gt);
@btime  test2($gt);


res1 = test1(gt);
res2 = test2(gt);

res1 == res2
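
One possible reason the loop length might not matter: each iteration recomputes sh_cm from sh_c0 and fat, which the loop never changes, so running the body 100 times gives the same value as running it once. An optimizer that notices this could in principle collapse the loop. The sketch below only checks the equivalence and does not show what the compiler actually does. `Str00Sketch`, `update_once`, and `repeat_update` are hypothetical names, and the plain positional constructor stands in for `@with_kw`/`@reset` as an assumption.

```julia
using StaticArrays

# Same field layout as Str00 in the MWE, without Parameters/Accessors
# (assumption: the positional constructor is equivalent for this check).
struct Str00Sketch{N}
    fat   :: SVector{N, Float64}
    sh_c0 :: SVector{N, Float64}
    sh_cm :: SVector{N, Float64}
end

# One update: sh_cm is rebuilt from fields that the update never modifies.
update_once(x::Str00Sketch) = Str00Sketch(x.fat, x.sh_c0, x.sh_c0 .* x.fat)

# Hypothetical helper: apply the update n times in a row.
function repeat_update(x, n)
    for _ in 1:n
        x = update_once(x)
    end
    return x
end

x0 = Str00Sketch(ones(SVector{16, Float64}),
                 rand(SVector{16, Float64}),
                 rand(SVector{16, Float64}))

# The update is idempotent, so 1 and 100 iterations agree.
@assert repeat_update(x0, 1) == repeat_update(x0, 100)
```

If the loop body instead read the field it writes (a loop-carried dependency), this equivalence would no longer hold and the loop could not be collapsed this way.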

System info

Laptop (i5-6300U, Julia 1.10.4)
PC (2990WX, Julia 1.11.0-rc1)

Laptop

  • No flag: timing does not depend on the outer loop length
julia> @btime  test1($gt);
  23.730 ns (0 allocations: 0 bytes)

julia> @btime  test2($gt);
  24.456 ns (0 allocations: 0 bytes)

With the -O1 optimization flag (but not -O2/-O3), the timing does depend on the loop length:

julia> @btime  test1($gt);
  12.909 μs (0 allocations: 0 bytes)

julia> @btime  test2($gt);
  12.497 μs (0 allocations: 0 bytes)

PC

  • No flag: test2 timing depends on the outer loop length
julia> @btime  test1($gt);
  24.407 ns (0 allocations: 0 bytes)

julia> @btime  test2($gt);
  6.592 μs (0 allocations: 0 bytes)

PC -O1 flag

julia> @btime  test1($gt);
  8.403 μs (0 allocations: 0 bytes)

julia> @btime  test2($gt);
  9.490 μs (0 allocations: 0 bytes)

PC -O2 (or -O3) flag

julia> @btime  test1($gt);
  25.980 ns (0 allocations: 0 bytes)

julia> @btime  test2($gt);
  6.594 μs (0 allocations: 0 bytes)
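
To check whether a timing scales with the loop length, one option is to pass the length as an argument and benchmark two values. This is a minimal sketch on bare SVectors rather than the struct from the MWE. `run_n` is a hypothetical helper, and the numbers will depend on the machine, Julia version, and optimization flag.

```julia
using BenchmarkTools, StaticArrays

# Hypothetical helper: repeat the same elementwise product n times.
# The body does not depend on earlier iterations, as in the MWE.
function run_n(a, b, n)
    c = zero(a)
    for _ in 1:n
        c = b .* a
    end
    return c
end

a = ones(SVector{16, Float64})
b = rand(SVector{16, Float64})

# @belapsed returns the minimum elapsed time in seconds.
t_short = @belapsed run_n($a, $b, 10)
t_long  = @belapsed run_n($a, $b, 1000)

println("n = 10:   ", t_short * 1e9, " ns")
println("n = 1000: ", t_long * 1e9, " ns")
```

If the two timings are roughly equal, the loop is probably being collapsed. If they differ by about 100x, every iteration is being executed.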