Kwargs allocating (?)

Hi there,

Consider the following simple function:

    using LinearAlgebra, BenchmarkTools  # provides I, mul!, and @btime

    M = [1.0  2.0;
         3.0  4.0]
    n = size(M,1)
    I_n = Matrix{Float64}(I, n, n)
    A_temp = zeros(n, n)
    diff_temp = zeros(n, n)
    function inf_norm_diff_type_stable_no_allocs2(args, M, I_n, A_temp, diff)
        # Copy the scalar inputs into the preallocated matrix A_temp
        for i in eachindex(A_temp)
            A_temp[i] = args[i]
        end
        # Compute the difference A*M - I in place
        mul!(diff, A_temp, M)
        @. diff -= I_n
        # Sum of the squared entries of the difference
        return mapreduce(x -> x^2, +, diff)
    end

As expected, this does not allocate:

    julia> @btime inf_norm_diff_type_stable_no_allocs2($(rand(4)), $(M), $(I_n), $(A_temp), $(diff_temp))
      21.314 ns (0 allocations: 0 bytes)

However, consider the following modification, which passes some of the arguments as keyword arguments:

    function inf_norm_diff_type_stable(args...; M, I_n, A_temp, diff)
        # Copy the scalar inputs into the preallocated matrix A_temp
        for i in eachindex(A_temp)
            A_temp[i] = args[i]
        end
        # Compute the difference A*M - I in place
        mul!(diff, A_temp, M)
        @. diff -= I_n
        return mapreduce(x -> x^2, +, diff)
    end

    julia> @btime inf_norm_diff_type_stable($(rand()), $(rand()), $(rand()), $(rand()); M = $(M), I_n = $(I_n), A_temp = $(A_temp), diff = $(diff_temp))
      36.374 ns (5 allocations: 80 bytes)

It does allocate!
Interestingly, the issue seems to come from the @. line, since commenting it out removes the allocations:

    function inf_norm_diff_type_stable2(args...; M, I_n, A_temp, diff)
        # Copy the scalar inputs into the preallocated matrix A_temp
        for i in eachindex(A_temp)
            A_temp[i] = args[i]
        end
        # Compute the product A*M only (the subtraction is commented out)
        mul!(diff, A_temp, M)
        #@. diff -= I_n
        return mapreduce(x -> x^2, +, diff)
    end

    julia> @btime inf_norm_diff_type_stable2($(rand()), $(rand()), $(rand()), $(rand()); M = $(M), I_n = $(I_n), A_temp = $(A_temp), diff = $(diff_temp))
      12.929 ns (0 allocations: 0 bytes)

Comparing @code_warntype and @code_lowered outputs of each is not helping me to identify the issue. Can someone let me know what I’m missing here?

Thanks in advance!

You made another significant change besides switching to keyword arguments: you changed args to args..., and that’s the real cause of your allocations:

    julia> function inf_norm_diff_type_stable3(args; M, I_n, A_temp, diff)
               # Copy the scalar inputs into the preallocated matrix A_temp
               for i in eachindex(A_temp)
                   A_temp[i] = args[i]
               end
               # Compute the difference A*M - I in place
               mul!(diff, A_temp, M)
               @. diff -= I_n
               return mapreduce(x -> x^2, +, diff)
           end
    inf_norm_diff_type_stable3 (generic function with 1 method)

    julia> @btime inf_norm_diff_type_stable3($(rand(4)); M = $(M), I_n = $(I_n), A_temp = $(A_temp), diff = $(diff_temp))
      32.460 ns (0 allocations: 0 bytes)
    14.74157464731775
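To make the difference between the two signatures concrete, here is a minimal sketch. The function names takes_collection and takes_varargs are hypothetical and only for illustration. With args, the caller passes one collection; with args..., the positional arguments are gathered into a tuple whose length is part of its type.

```julia
# Hypothetical helpers: each returns the type of what it receives as `args`.
takes_collection(args) = typeof(args)   # one positional argument
takes_varargs(args...) = typeof(args)   # all positional arguments, gathered into a tuple

t1 = takes_collection(rand(4))              # Vector{Float64}
t2 = takes_varargs(1.0, 2.0)                # Tuple{Float64, Float64}
t3 = takes_varargs(1.0, 2.0, 3.0, 4.0)      # NTuple{4, Float64}
```

So the two benchmarks above do not differ only in positional versus keyword arguments: one indexes into a Vector{Float64}, the other into a tuple of four Float64 values.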

@code_warntype and related reflection tools assume full specialization (and arguably should warn users about that), but the compiler does not automatically specialize a method, at least not fully, on Function, Type, or Vararg arguments, so args[i] is type-unstable. If you want to opt into specialization, you need method type parameters; see “Be aware of when Julia avoids specializing” in the Performance Tips for examples.
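A minimal sketch of what opting in could look like here, following the Vararg{T,N} ... where {N} pattern from the Performance Tips. The name inf_norm_diff_vararg_n is hypothetical, the element type is assumed to be Float64, and I have not benchmarked this variant, so whether it actually removes the 5 allocations should be checked with @btime.

```julia
using LinearAlgebra  # I and mul!

# Hypothetical variant: the method parameter N asks the compiler to
# specialize on the number of positional arguments as well as their type.
function inf_norm_diff_vararg_n(args::Vararg{Float64,N}; M, I_n, A_temp, diff) where {N}
    # Copy the scalar inputs into the preallocated matrix A_temp
    for i in eachindex(A_temp)
        A_temp[i] = args[i]
    end
    # Compute the difference A*M - I in place
    mul!(diff, A_temp, M)
    @. diff -= I_n
    # Sum of the squared entries of the difference
    return mapreduce(x -> x^2, +, diff)
end

M = [1.0 2.0; 3.0 4.0]
n = size(M, 1)
I_n = Matrix{Float64}(I, n, n)
A_temp = zeros(n, n)
diff_temp = zeros(n, n)

# With A equal to the identity, A*M - I = [0 2; 3 3], whose squared entries sum to 22.
r = inf_norm_diff_vararg_n(1.0, 0.0, 0.0, 1.0; M = M, I_n = I_n, A_temp = A_temp, diff = diff_temp)
```

The call syntax is unchanged, so the same @btime line as before can be reused with this function name to compare allocation counts.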

As for why commenting out @. diff -= I_n resulted in zero allocations, I don’t know; I don’t think the for-loop becomes dead code.


Thanks! Useful to know that iterating over Vararg inputs is type-unstable. It’s hard to tell from @code_warntype, so thanks for letting me know.

I am still puzzled by why the type instability seems to go away when commenting out @. diff -= I_n

If anyone knows why, it would be great to hear.

Quickly fact-checking myself: args[i] might be type-stable after all, or at least potentially so. It’s harder to see in the underlying method that moves all the keyword arguments to the front, but that method does specialize on all of the keyword arguments’ types, including those of diff and I_n, and on the types of the args elements, just not on their number:

    julia> methods(var"#inf_norm_diff_type_stable2#4")[1].specializations
    MethodInstance for var"#inf_norm_diff_type_stable2#4"(::Matrix{Float64}, ::Matrix{Float64}, ::Matrix{Float64}, ::Matrix{Float64}, ::typeof(inf_norm_diff_type_stable2), ::Float64, ::Vararg{Float64})

Unfortunately, @code_llvm also assumes full specialization, so it doesn’t help spot where and why the allocations occur.


Thanks for fact-checking yourself and taking the time to go deeper. Hopefully we figure out in the end why these allocations are happening.
