While writing a method, I used some child (internal?) functions to avoid copy/pasting code that was used repeatedly. The functions are not that complicated, so I figured the compiler would just inline them: no overhead, and the code would be easier to read. Instead, memory allocations went from 0 to multiple KiB, with the associated drop in performance (Julia 1.5.3 on Linux).
using BenchmarkTools

function test(data, offset)
    # Helper closures: defined here but never called in the loop below
    avail = () -> length(data) - offset + 1
    block = () -> reinterpret(UInt32, view(data, offset:(offset+8)))
    total = 0
    while offset < length(data)
        total += data[offset]
        offset += 1
    end
    return total
end

@benchmark test(d, 1) setup=(d=rand(Int, 1024))
What I would like to do is use the avail() and block() helpers without the performance hit of memory allocations, for example by changing the loop condition to avail() > 0, so no callbacks or anything complicated. Is there another way to abstract that code out? I feel like a macro (if you can even do macros in a function scope) would be overkill, but maybe that’s the way to go?
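To make the request concrete, here is a minimal sketch of the kind of loop I have in mind. The name test_avail is hypothetical and only for illustration, and block is left out because the loop does not use it. Note that avail() > 0 also visits the last element, unlike the offset < length(data) condition in the original.

```julia
# Sketch only: test_avail is a hypothetical name, not part of the original code.
function test_avail(data, offset)
    # Number of elements remaining from the current offset (inclusive)
    avail = () -> length(data) - offset + 1
    total = 0
    # Loop while at least one element remains
    while avail() > 0
        total += data[offset]
        offset += 1
    end
    return total
end
```

This version has not been benchmarked here; it only shows the shape of the code I would like to be able to write.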
If you run the above example you will see results like:
BenchmarkTools.Trial:
memory estimate: 55.98 KiB
allocs estimate: 3583
--------------
minimum time: 101.712 μs (0.00% GC)
median time: 104.855 μs (0.00% GC)
mean time: 108.568 μs (1.06% GC)
maximum time: 1.061 ms (87.64% GC)
--------------
samples: 10000
evals/sample: 1
Running Julia with --track-allocation shows that the allocations are all in the loop:
- total = 0
448806912 while offset < length(data)
896737248 total += data[offset]
224841744 offset += 1
- end
If you comment out avail and block, which are not used, you will see the allocations go to 0 and a sizable increase in performance.
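For comparison, a sketch of the variant with the two closures commented out. The name test_noclosures is hypothetical, chosen only so both versions can live in the same session; the body is otherwise the same as test above.

```julia
# Sketch only: test_noclosures is a hypothetical name for the comparison variant.
function test_noclosures(data, offset)
    # avail = () -> length(data) - offset + 1
    # block = () -> reinterpret(UInt32, view(data, offset:(offset+8)))
    total = 0
    # Same loop as the original: stops before the last element
    while offset < length(data)
        total += data[offset]
        offset += 1
    end
    return total
end
```

It can be benchmarked the same way as the original, for example with @benchmark test_noclosures(d, 1) setup=(d=rand(Int, 1024)).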