Spurious memory allocation even though @code_warntype looks fine (no types marked in red)

#1

I have trouble with this piece of code:


struct Foo{T,N}
    a::Array{T,N}
    l::Int
end

function Base.iterate(foo::Foo{T,N})  where {T,N}
   v = Array{Float64,N}(undef,size(foo.a))
   k = 0
   if k == foo.l; return nothing end
   next = (foo.a,k)
   iteratekernel(next,v)
end

function Base.iterate(foo::Foo,state)
  v, k = state
  if k == foo.l; return nothing end
  next = (foo.a,k)
  iteratekernel(next,v)
end

function iteratekernel(next,v)
    a, k = next
    for i in 1:size(a,1)
        v[i] = a[i][2]
    end
    ((v,size(v)),(v,k+1))
end

function run(x)
    for xi in x
    end
end

t = (1.0,2.0,3.0)
arr = [t,t,t,t]
foo = Foo(arr,100000)

It seems to lead to spurious memory allocation:

@time run(foo)
@time run(foo)
  0.065006 seconds (149.80 k allocations: 5.529 MiB)
  0.005052 seconds (100.01 k allocations: 3.052 MiB, 64.92% gc time)

The output of

@code_warntype iterate(foo)
@code_warntype iterate(foo,iterate(foo)[2])

looks fine (no types marked in red).

Can somebody clarify why it is not working properly?

Thanks in advance for the help!

#2

Have you tried to analyze it with Traceur.jl?

#4

The statement

@trace run(foo)

runs silently.

The code passes all basic checks and still leads to spurious memory allocation. Does anybody have an explanation?

#5

Forcing the functions to be inlined (using `@inline`) makes the allocations go away.

Tuples that wrap non-`isbits` objects, like `(foo.a, k)`, need to be allocated unless the compiler can show that they do not escape the function.
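To make the suggestion concrete, here is a minimal sketch of the code from #1 with `@inline` added to both `iterate` methods and to `iteratekernel`. Two things are my own changes and not part of the original post: the end-of-iteration check is moved before the allocation of `v`, and the hypothetical helper `consume` replaces `run` so that it returns the number of iterations. Whether the allocations actually disappear depends on the Julia version, so check the printed number on your own setup.

```julia
struct Foo{T,N}
    a::Array{T,N}
    l::Int
end

# First call: allocate the work buffer once and start at k = 0.
@inline function Base.iterate(foo::Foo{T,N}) where {T,N}
    foo.l == 0 && return nothing
    v = Array{Float64,N}(undef, size(foo.a))
    return iteratekernel((foo.a, 0), v)
end

# Later calls: reuse the buffer carried in the state.
@inline function Base.iterate(foo::Foo, state)
    v, k = state
    k == foo.l && return nothing
    return iteratekernel((foo.a, k), v)
end

# Inlining lets the compiler see that the tuple (a, k) never escapes.
@inline function iteratekernel(next, v)
    a, k = next
    for i in 1:size(a, 1)
        v[i] = a[i][2]
    end
    return ((v, size(v)), (v, k + 1))
end

# Hypothetical helper (not in the original post): loop and count the items.
function consume(x)
    n = 0
    for xi in x
        n += 1
    end
    return n
end

t = (1.0, 2.0, 3.0)
foo = Foo([t, t, t, t], 100_000)

consume(foo)                       # warm up so compilation is not measured
println(@allocated consume(foo))   # bytes allocated by one full pass
```

The buffer `v` is still allocated once per pass, so the reported number is not expected to be exactly zero.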

#6

Does this mean that one has to use `@inline` when implementing a new method for `Base.iterate`? The tuples returned by `iterate` are likely to wrap non-`isbits` objects in complex cases, right?

#7

In complex cases the allocation is unlikely to have a significant impact on performance. But sure, with Julia's current escape analysis, you sometimes have to help it by benchmarking and seeing whether forcibly `@inline`-ing functions improves performance.
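One way to do that kind of check without extra packages is to compare `@allocated` for the same kernel forced out of line and forced inline. This is only a sketch: `step_noinline`, `step_inline` and the two loop functions are hypothetical names made up for illustration, and the numbers you get depend on the Julia version (newer releases can avoid allocating such tuples in more cases).

```julia
# Same kernel twice: it returns a tuple that wraps a non-isbits object (the array).
@noinline step_noinline(a, k) = (a, k + 1)
@inline step_inline(a, k) = (a, k + 1)

function loop_noinline(a, n)
    k = 0
    for _ in 1:n
        _, k = step_noinline(a, k)
    end
    return k
end

function loop_inline(a, n)
    k = 0
    for _ in 1:n
        _, k = step_inline(a, k)
    end
    return k
end

a = [1.0, 2.0, 3.0]

# Run both once so that compilation is not counted.
loop_noinline(a, 1)
loop_inline(a, 1)

bytes_noinline = @allocated loop_noinline(a, 100_000)
bytes_inline = @allocated loop_inline(a, 100_000)
println("noinline: ", bytes_noinline, " bytes, inline: ", bytes_inline, " bytes")
```

If the two numbers are the same on your machine, the `@inline` annotation is probably not buying you anything for that function.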

#8

Perfect! That answers my question. Thanks for helping!