I use a generated function for fast evaluation of monomials and their gradients. The following example could easily be rewritten without a generated function, but I don’t see how to easily implement its gradient without @generated. The problem I have is easier to see in this simpler version:
using StaticArrays, BenchmarkTools
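@generated function monomial(α, x::SVector{K}) where {K}
    # @assert length(α) >= K
    # build the source text "@fastmath x[1]^α[1] * x[2]^α[2] * ... " as a string
    ex = "@fastmath "
    for i = 1:K
        ex *= "x[$i]^α[$i] * "
    end
    ex = ex[1:end-3]   # strip the trailing " * "
    quote
        $(parse(ex))   # Meta.parse(ex) on Julia 0.7 and later
    end
end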
α3 = (1,2,3)
x3 = @SVector rand(3)
@btime monomial($α3, $x3) # 15.546 ns (0 allocations: 0 bytes)
α6 = (1,2,3,2,5,2)
x6 = @SVector rand(6)
@btime monomial($α6, $x6) # 2.037 μs (29 allocations: 528 bytes)
If the length of x is 1, 2, 3, 4, or 5, then there is no allocation; if it is 6, then there are allocations. Does anybody have an idea what is going on here?
EDITS:
- If I turn off the @fastmath macro, then the allocations go to zero! But I really need @fastmath; it gives me a factor of 5-10 performance gain here.
- If I apply @fastmath to each individual x[i]^α[i], then the allocations go down to zero as well, but when the length of the vector reaches 8, it starts allocating again.
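
For comparison, here is a minimal sketch of the same generated function with the body built as an expression tree rather than a parsed string. It makes several assumptions: the names monomial_body and monomial_expr are hypothetical, a plain NTuple stands in for SVector so the snippet runs without extra packages, and the product is written as nested two-argument multiplications instead of one long a * b * c * ... chain. I have not benchmarked it, so whether it changes the allocation behaviour described above would need to be checked with @btime.

```julia
# Hypothetical helper (not from the original post): builds the expression
# ((x[1]^α[1]) * x[2]^α[2]) * ... * x[K]^α[K] as nested binary products.
# It is an ordinary function because a generated-function body may not
# contain comprehensions or closures.
function monomial_body(K::Int)
    ex = :(x[1]^α[1])
    for i in 2:K
        ex = :($ex * x[$i]^α[$i])
    end
    return ex
end

# Assumption: NTuple{K,T} is used in place of SVector{K} to keep this self-contained.
@generated function monomial_expr(α, x::NTuple{K,T}) where {K,T}
    body = monomial_body(K)
    # wrap the whole product in @fastmath, as in the original version
    return :(@fastmath $body)
end

α = (1, 2, 3)
x = (2.0, 3.0, 0.5)
monomial_expr(α, x)   # evaluates 2.0^1 * 3.0^2 * 0.5^3
```

Building the expression directly avoids the string round trip through parse, and the helper can be called on its own (for example monomial_body(3)) to inspect exactly what code is being generated for a given length.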