Reducing getfield overhead?

An MWE:

using BenchmarkTools

struct Get_Field_Test
    x::Vector{Float64}
    y::Vector{Float64}
end

ts = Vector{Get_Field_Test}(undef, 1000)
for i in 1:1000
    ts[i] = Get_Field_Test(rand(100), rand(100))
end

function test1(ts)
    for i in 1:1000
        @inbounds ts[i].x .+= ts[i].y
    end
end
function test2(ts)
    for i in 1:1000
        for j = 1:100
            @inbounds ts[i].x[j] += ts[i].y[j]
        end
    end
end

@btime test1($ts)

@btime test2($ts)

The benchmark results (test1 first, then test2):

31.250 μs (0 allocations: 0 bytes)
105.769 μs (0 allocations: 0 bytes)

The overhead of the second test is annoying. I’m wondering if there’s a way to lower it?

By eliminating common subexpressions:

function test3(ts)
    for i in 1:1000
        let tsix = ts[i].x, tsiy = ts[i].y
            for j = 1:100
                @inbounds tsix[j] += tsiy[j]
            end
        end
    end
end

But this is essentially the same as test1 anyway.
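Another way to write the same idea is to move the inner loop into its own function, so that the field loads happen once per element at the call site. This is only a sketch: `add_into!` and `test4` are hypothetical names that do not appear in the thread, `@simd` is an optional extra, and nothing here has been benchmarked against the numbers above.

```julia
struct Get_Field_Test
    x::Vector{Float64}
    y::Vector{Float64}
end

# Hypothetical helper: the kernel only ever sees two plain vectors,
# so no field access happens inside the inner loop.
function add_into!(x::AbstractVector, y::AbstractVector)
    # eachindex(x, y) throws if the two vectors have different axes,
    # which is what makes @inbounds safe here.
    @inbounds @simd for j in eachindex(x, y)
        x[j] += y[j]
    end
    return x
end

# Hypothetical variant of test3: t.x and t.y are loaded once per element.
function test4(ts)
    for t in ts
        add_into!(t.x, t.y)
    end
    return ts
end
```

It would be timed the same way as the others, with `@btime test4($ts)`. Iterating over `ts` directly also removes the hard-coded 1000 and 100, so the function works for any sizes.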

In test2, the loads of ts[i].x and ts[i].y get hoisted out of the inner loop, but the checks against nullptr do not. That is, the isassigned check, which exists so that an UndefRefError can be thrown for an unassigned element, is the problem.
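To see where that check comes from, here is a small sketch of what an unassigned slot looks like. `PairOfVectors` and `slots` are hypothetical names used only for illustration; the struct has the same layout as Get_Field_Test.

```julia
# Hypothetical stand-in with the same layout as Get_Field_Test.
struct PairOfVectors
    x::Vector{Float64}
    y::Vector{Float64}
end

# `undef` leaves the slots unassigned because the element type holds references.
slots = Vector{PairOfVectors}(undef, 2)
slots[1] = PairOfVectors([1.0], [2.0])

first_assigned = isassigned(slots, 1)    # slot 1 was filled above
second_assigned = isassigned(slots, 2)   # slot 2 was never written

# Reading an unassigned slot throws UndefRefError.
threw = try
    slots[2].x
    false
catch err
    err isa UndefRefError
end
```

`@inbounds` only removes the bounds check, not this undefined-reference check, so every `ts[i]` that stays inside the inner loop still has to be tested for it.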
