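Looks like there are two issues: splatting, and specialization on the function argument (by default, Julia only specializes a method on a function argument if that argument is called inside the method, not when it is merely passed through to another function). This gets rid of the allocations: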
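# The type parameter `F` forces specialization on the function being broadcast.
# Explicit `src1, src2` arguments replace the splatted `srcs...`.
function Base.Broadcast.broadcast!(f::F, dest::Foo, src1, src2) where {F}
    broadcast!(f, unsafe_get(dest), unsafe_get(src1), unsafe_get(src2))
    return dest
end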
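To illustrate the specialization point on its own, here is a minimal sketch. The names `apply_passthrough` and `apply_specialized` are made up for this example and are not part of the original code. Both functions return the same result; the only difference is the `F` type parameter, which asks the compiler for a separate specialization per function type. Whether you actually see an allocation difference depends on the Julia version and on inlining, so treat it as an illustration rather than a benchmark.

```julia
# `f` is only passed through to `map`, never called here, so Julia may
# compile this method without specializing on the concrete type of `f`.
apply_passthrough(f, xs) = map(f, xs)

# Annotating `f::F` with a type parameter forces a specialization for
# each distinct function type passed in.
apply_specialized(f::F, xs) where {F} = map(f, xs)

xs = [-1, 2, -3]
apply_passthrough(abs, xs)
apply_specialized(abs, xs)
```

Comparing the two with `@allocated` (after a warm-up call) is one way to check whether the annotation makes a difference in a given setup.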
I’m not sure whether there’s a way of avoiding the allocation while still using splatting. However, note that the allocation is only 16 bytes per iteration, so if the array is large this may not matter in practice (no copy of the array is made).
BTW, this thread might be useful.