Before I found accumulate, I had written a simple implementation of my own. I was then pointed to accumulate! and tried it. Benchmarking with @btime showed that it allocates and takes nearly twice as long as my implementation. Further investigation suggested that the extra time is spent in argument handling: if I bypass accumulate! and call Base._accumulate! directly, it runs at the same speed as the one I wrote:
using BenchmarkTools  # provides @btime
f1(a, x) = (a[2], a[2]+x);
v1 = rand(100);
buf1 = Vector{NTuple{2,Float64}}(undef, length(v1));
init = (0.0, 0.0);
r1 = @btime accumulate!($f1, $buf1, $v1; init=$init);
> 103.814 ns (3 allocations: 96 bytes)
r2 = @btime Base._accumulate!($f1, $buf1, $v1, nothing, Some($init));
> 57.447 ns (0 allocations: 0 bytes)
r1 == r2
> true
Looking at the source, accumulate! runs some conditionals over its keyword arguments. As I understand it, this is to support nothing as a valid value for init. That seems like a high performance cost for that generality. I tried several changes within that single method to speed it up while keeping that behavior, but nothing resolved the performance issue. Requiring the user to pass Some(nothing) when nothing should be the init could work, but changing the signature or adding a new one is probably undesirable at this point.
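For anyone who has not read that method, here is a rough sketch of the kind of keyword handling I mean. It is my own stand-in, not a copy of the Base source: kw_accumulate! is a hypothetical name, and it forwards to the internal Base._accumulate! exactly as in my benchmark above. The point is only that capturing init through kw... keeps "init omitted" and "init = nothing" distinguishable.

```julia
# Hypothetical stand-in (not the Base source): `init` arrives through `kw...`
# so that an omitted `init` can be told apart from `init = nothing`.
function kw_accumulate!(op, B, A; dims::Union{Integer, Nothing} = nothing, kw...)
    nt = values(kw)  # keyword arguments as a NamedTuple
    if isempty(kw)
        # No init supplied: the internal method seeds from the first element.
        Base._accumulate!(op, B, A, dims, nothing)
    elseif keys(nt) === (:init,)
        # init supplied (possibly `nothing`): wrap it in Some to mark it as present.
        Base._accumulate!(op, B, A, dims, Some(nt.init))
    else
        throw(ArgumentError("unsupported keyword arguments: $(keys(nt))"))
    end
end
```

I make no timing claim about this sketch; it only shows the branching that the keyword arguments go through.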
However, I think I found a solution. Would it work to use a special private sentinel value to signal that no init was passed, so that an ordinary keyword default can be used? Something like this:
struct _DefinitelyNothingThisTime end
function test_accumulate!(op, B, A; dims::Union{Integer, Nothing} = nothing, init = _DefinitelyNothingThisTime)
    Base._accumulate!(op, B, A, dims, init === _DefinitelyNothingThisTime ? nothing : Some(init))
end
It seems to work (with init, without init, and for init=nothing) and performs well. Is there any issue with that approach?
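This is roughly how I checked the three cases. The definitions are repeated so the snippet runs on its own; op is a hypothetical operator chosen so that init=nothing gives a different result from no init at all, and Base._accumulate! is still the internal function, so its signature is not guaranteed to stay stable across Julia versions.

```julia
struct _DefinitelyNothingThisTime end

# Same definition as above: the sentinel type itself is the default for `init`.
function test_accumulate!(op, B, A; dims::Union{Integer, Nothing} = nothing, init = _DefinitelyNothingThisTime)
    Base._accumulate!(op, B, A, dims, init === _DefinitelyNothingThisTime ? nothing : Some(init))
end

# Hypothetical operator that accepts `nothing` as its left argument,
# so that `init = nothing` is observably different from an omitted init.
op(a, x) = a === nothing ? -x : a + x

A = [1, 2, 3]

# Each result is compared against the public accumulate! for the same call.
no_init      = test_accumulate!(op, similar(A), A)                  # [1, 3, 6]
with_init    = test_accumulate!(op, similar(A), A; init = 10)       # [11, 13, 16]
nothing_init = test_accumulate!(op, similar(A), A; init = nothing)  # [-1, 1, 4]

println(no_init == accumulate!(op, similar(A), A))
println(with_init == accumulate!(op, similar(A), A; init = 10))
println(nothing_init == accumulate!(op, similar(A), A; init = nothing))
```

The one edge case I can see is a caller passing the sentinel itself as init, which would be treated as if no init had been given; since the type is private, that seems unlikely in practice.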