Allocation using broadcasting with custom type

Hi,

I am seeing some suspicious allocations when broadcasting over a custom type, in particular when the broadcast involves Numbers. The MWE is below. My actual code is somewhat larger, but this example reproduces the behaviour.

struct Foo{T, A<:AbstractMatrix{T}} <: AbstractMatrix{T}
    data::A
end
Foo(data::A) where {A<:AbstractMatrix} = Foo{eltype(data), A}(data)

@inline Base.unsafe_get(f::Foo) = f.data

# Catch call to broadcast, then rebroadcast to field data
@generated function Base.Broadcast.broadcast!(f, dest::Foo, src::Vararg{Any, N}) where N
    args = [:(unsafe_get(src[$k])) for k = 1:N]
    quote
        broadcast!(f, unsafe_get(dest), $(args...))
        return dest
    end
end


# function that allocates
function bar(out, c::Number, x) 
    for i = 1:10000
        out .= x .* c
    end
    out
end

x   = Foo(randn(100, 100))
out = Foo(randn(100, 100))
c   = 1.0

@show @allocated bar(out, c, x)
@show @allocated bar(out, c, x)
@show @allocated bar(out, c, x)
@show @allocated bar(out, c, x)

Foo is the type I want to broadcast over, e.g., in the function bar. I have overloaded broadcast! for my custom type using a generated function. The code above prints

@allocated(bar(out, c, x)) = 3955676
@allocated(bar(out, c, x)) = 160000
@allocated(bar(out, c, x)) = 160000
@allocated(bar(out, c, x)) = 160000

If you change the line in the loop in bar to out .= x .* x, all allocations disappear. Any pointers are welcome.

Thanks

Looks like there are two issues: splatting, and specialization on the function argument (by default, Julia only specializes on a function argument if that function is called in the method body, not if it is merely passed through). This gets rid of the allocations:

function Base.Broadcast.broadcast!(f::F, dest::Foo, src1, src2) where {F}
    broadcast!(f, unsafe_get(dest), unsafe_get(src1), unsafe_get(src2))
    return dest
end

I’m not sure whether there’s a way of avoiding the allocations while still using splatting. However, note that the allocation is only 16 bytes per iteration (160000 bytes over 10000 iterations), so if the array is large this may not matter in practice (no copy of the array is made).
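To illustrate the specialization point on its own, outside of broadcasting, here is a minimal sketch. The names forward_nospec and forward_spec are hypothetical, and it is written against current Julia syntax rather than the version used in this thread. It only shows the two ways of declaring the function argument; it makes no claim about allocation counts.

```julia
# Hypothetical helpers, for illustration only.

# `f` is only forwarded to `map`, never called here, so Julia's heuristics
# may compile one method instance shared by all function arguments.
forward_nospec(f, xs) = map(f, xs)

# Binding the function's type to a parameter `F` requests a specialized
# instance for each distinct function that is passed in.
forward_spec(f::F, xs) where {F} = map(f, xs)

xs = [1.0, 2.0, 3.0]
r1 = forward_nospec(abs2, xs)
r2 = forward_spec(abs2, xs)
```

Both calls return the same values; the difference is only in how the compiler is allowed to specialize the method on f.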

BTW, this thread might be useful.

Thanks! I was going crazy trying to understand why this happens. Note that it only happens with custom types; it does not allocate when plain arrays are used with the dot notation (is this a bug? can it be fixed?).

The allocation is small, but annoying.

Follow-up question: I need to generate many versions of this function for different numbers of arguments.

This code does what I need. I am posting it here in case someone else ever faces a similar issue.

for nargs = 1:10
    args  = [Symbol("src", i) for i = 1:nargs]
    calls = [:(unsafe_get($(args[i]))) for i = 1:nargs]
    @eval @generated function broadcast_c!(f, ::Type{FTField}, ::Type{FTField}, dest, $(args...))
        :(broadcast!(f, unsafe_get(dest), $($calls...)))
    end
end

A possible explanation is that the code for Array uses @inline annotations on the functions that take varargs, while the functions that are not inlined take a tuple of arrays rather than varargs. You could take inspiration from it.
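A rough sketch of that pattern, as I understand it, could look like the following. Wrapped, unwrap, rebroadcast! and _rebroadcast! are hypothetical names made up for this example (Wrapped stands in for Foo), and the code is written for current Julia. I have not measured whether it removes the allocation on any particular version.

```julia
# Hypothetical wrapper type standing in for `Foo`.
struct Wrapped{A<:AbstractMatrix}
    data::A
end

unwrap(w::Wrapped) = w.data
unwrap(x) = x   # numbers and plain arrays pass through unchanged

# Non-inlined worker: receives the sources as a single tuple, not as varargs.
@noinline function _rebroadcast!(f::F, dest::Wrapped, srcs::Tuple) where {F}
    broadcast!(f, unwrap(dest), map(unwrap, srcs)...)
    return dest
end

# Thin varargs entry point, marked @inline so that the compiler has the
# chance to form the tuple at the call site.
@inline rebroadcast!(f::F, dest::Wrapped, srcs::Vararg{Any,N}) where {F,N} =
    _rebroadcast!(f, dest, srcs)

x   = Wrapped([1.0 2.0; 3.0 4.0])
out = Wrapped(zeros(2, 2))
rebroadcast!(*, out, x, 2.0)
```

After the call, out.data holds x.data multiplied elementwise by 2.0. The split keeps the varargs signature at the surface while the worker sees a concrete tuple type.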

I might be wrong, but it’s possible that these allocations will go away once non-isbits structs are stored unboxed. (I’m guessing this is a non-inlined varargs call that passes a tuple, and the tuple needs to be constructed on the heap.) I.e., hopefully it’s just a (known) compiler improvement away.

Do you mean in the broadcast code? Could you provide an example in Base?

I meant the method you get with e.g. @less broadcast!(*, [1], 1, [1]), and the methods which are called from there.

Ok. But adding @inline annotations does not seem to be the cure.