Slow `reduce(vcat, itr)`

mikmoore · September 13, 2022, 3:38pm

I aim to deprecate the idiom, not either function. What I mean is to stop encouraging people to use reduce(vcat,...), discourage them from using it unless they’re trying to support old Julia versions, and instead encourage stack for this purpose.

Although very technically, removing the specializations would be non-breaking because a naive call to reduce(vcat,...) would return the correct answer – although with a staggering performance degradation. I don’t seriously propose doing this. The specializations can stick around at least until v2.0 and longer if they’re still entrenched.

aplavin · September 13, 2022, 5:49pm

reduce(vcat,...) is problematic because (…)

Kinda agree: it always works and does what’s needed, but not always performant.
It was a nice idea to try switching to fast path with reduce(vcat, array), but turned out not to be intuitive and lacks predictable performance.

looks like stack might already be the function we need

As planned for 1.9, stack will always, well, “stack” input arrays along a new dimension. That’s different from what reduce(vcat) does.
Why cram yet another functionality into stack? flatten is a nice name, and present in both stdlibs (Iterators) and packages (eg SplitApplyCombine.jl) with the same meaning.

Not really arguing for inclusion of flatten into Base, btw - I think it’s totally fine to have such a function in a package. Base cannot contain all reasonable data manipulation functions anyway, the question is where to draw the line. But also not against inclusion, just neutral.

mikmoore · September 13, 2022, 6:00pm

Thanks. You’re right. First I underestimated what stack was capable and people pointed out that it had a dims argument. But since then I’ve overestimated it in assuming that dims didn’t have to be a new dimension. I don’t have a 1.9 checkout so I haven’t been able to test it, but did take a closer look at the docstring.

So then there is a vacancy for a concat(_;dims) or concatenate(_;dims) function to generalize cat(_...;dims) for collections (as maximum is to max) and supersede reduce(vcat,_) and reduce(hcat,_).

DNF · September 13, 2022, 6:08pm

Installing and using Compat (or using Compat: stack) will allow you to test stack.

mcabbott · September 13, 2022, 7:14pm

Maybe the question here is: How often does anyone use these for operations which aren’t either stack or flatten? (Where flatten works like collect ∘ Iterators.flatten, but faster.)

reduce(vcat, vec_of_matrices) makes a matrix. But my impression is that its main use case is vectors of vectors, with other things just inherited from vcat. Even reduce(vcat, [[i;;;] for i in 1:3]) works, but surely is pretty rare. (Edit: I presume that’s why the optimised method doesn’t bother to include this.) Do we need a new verb for these cases at all?

mikmoore · September 13, 2022, 9:11pm

Fair question. Indeed, I would imagine that a serious majority of cases are covered by stack and some others by flatten. But there is currently no function to efficiently concatenate 3-dimensional arrays along any of the first 3 dimensions (although stack could insert one). One can use reshape gymnastics to make it work in many more cases, but if I was happy with solutions like that I never would have left MATLAB.

Notice that your example produces the correct result (as reduce always does), but is not efficient because it misses the reduce specializations:

julia> z = [[i;;;] for i in 1:3]; @btime reduce(vcat, $z);
  1.210 μs (26 allocations: 960 bytes)

julia> z = [[i;;;] for i in 1:30]; @btime reduce(vcat, $z);
  18.100 μs (377 allocations: 16.66 KiB)

# superior despite a materialized broadcast
julia> z = [[i;;;] for i in 1:30]; @btime reshape(reduce(vcat, reshape.($z,1)),:,1,1);
  956.757 ns (65 allocations: 3.31 KiB)

EDIT: I guess I’m suggesting that concatenation is not an exotic operation and that it’s something that I’d expect a language like Julia (with its otherwise-excellent N-dimensional array support) to handle well. In its current state, it’s a minefield of performance traps that sometimes requires reshape-fu or writing your own concatenation functions.

CameronBieganek · September 13, 2022, 11:03pm

Also agreed. I think this is a case where we got burnt by the power of the language. I imagine someone said, “Oh neat, we can add a reduce(typeof(vcat), ...) method to get fast array concatenation,” which resulted in the reduce(vcat, ...) fast path being added to the language rather than a stack or flatten or concat function.

That’s not to say I wouldn’t have made the same mistake, just that hindsight is 20/20.

Topic		Replies	Views
Iterating over cols, vcat, repeat? Performance	7	881	October 2, 2018
Using hcat with splatting on arrays of different lengths kills Julia General Usage compilation , memory , memory-allocation	2	1395	October 2, 2018
Vector of Vectors with different size to one Vector General Usage vector	9	4269	June 2, 2020
Vcat slow? General Usage	12	1682	June 15, 2020
Thread-parallel reduce(vcat, ...) performance issues when providing init Performance	2	759	May 11, 2021

Slow `reduce(vcat, itr)`

Related topics