Memory on array element assignment

I am writing some sort of histogramming function and am having trouble with bad performance due to dynamic allocations in each assignment. It boils down to the following simple problem:

julia> a=zeros(UInt16,1000000);

julia> @time for n=1:size(a,1); a[n]=mod(n,10);end
  0.169991 seconds (3.00 M allocations: 61.067 MiB, 52.07% gc time, 1.82% compilation time)

julia> @time for n=1:1000000; a[n]=mod(n,10);end
  0.022683 seconds (999.49 k allocations: 15.251 MiB)

Is there a way to avoid the 3 Million allocations in the middle example?

Seems like you’re just seeing the usual performance issues with global variables: Performance Tips · The Julia Language

What happens if you avoid global variables?

2 Likes
julia> function foo()
       b=zeros(UInt16,1000000);
       for n=1:size(b,1)
          b[n]=mod(n,10)
       end
       end
foo (generic function with 1 method)


julia> @time foo()
  0.002235 seconds (2 allocations: 1.907 MiB)

Seems like you are right.

1 Like

So I tried to dig deeper and it turned out that a datatype argument is at the bottom of my problem:

julia> function foo(dtype = UInt32)
              b=zeros(dtype,350,20,1,128);
              for n = 1:length(b)
                 b[n] = one(dtype) 
              end
       end
foo (generic function with 2 methods)

julia> @time foo()
  0.085849 seconds (895.49 k allocations: 17.082 MiB, 7.75% gc time)

If you replace the dtype directly by UInt32 the allocations are gone. I guess this should be handled via an proper explicit dispatch, but it is a bit odd that these allocations are appearing.

At first I thought that foo was type-unstable, but since it always returns nothing::Nothing, I don’t think that satisfies the definition of type-instability. However, if you modify the function to accept a number of a specific type, then the allocation also goes away:

julia> function foo2(x = one(UInt32))
           dtype = typeof(x)
           b=zeros(dtype,350,20,1,128)
           for n = 1:length(b)
               b[n] = one(dtype)
           end
       end
foo2 (generic function with 2 methods)

julia> foo2(); @time foo2()
  0.000715 seconds (3 allocations: 3.418 MiB)

julia> foo2(1.0); @time foo2(1.0)
  0.001424 seconds (3 allocations: 6.836 MiB)
1 Like

The correct signature is

function foo(::Type{T}) where {T}

Otherwise, you will not get method specialization, and you’ll just get a generic method foo(::DataType)

5 Likes

BTW, this function returns nothing, and has no effects that are observable from the outside. In principle, it can be simplified to

foo(dtype) = nothing

To be sure some work really happens, return the array.

1 Like

https://docs.julialang.org/en/v1/manual/performance-tips/#Be-aware-of-when-Julia-avoids-specializing

As a heuristic, Julia avoids automatically specializing on argument type parameters in three specific cases: Type , Function , and Vararg . Julia will always specialize when the argument is used within the method, but not if the argument is just passed through to another function. This usually has no performance impact at runtime and improves compiler performance. If you find it does have a performance impact at runtime in your case, you can trigger specialization by adding a type parameter to the method declaration.

2 Likes

You are right, but then it is all the more surprising that the function causes so many allocations. If you return b, you also obtain these allocations.

Interesting point. However it is still a bit unclear, why allocations result by just this simple assignment. Is this really unavoidable due to the resulting dynamic typing?

I didn’t mean to say that the compiler converts the program to do nothing, only that it could, in principle, do that in the not-so-distant future.

Not sure I understand the question correctly, but I don’t think this has to do with assignment. The allocations occur when allocating and initializing the array, b, and when subsequently updating its elements.