reduce(hcat, v) is optimized AFAIU.
Just saw that https://github.com/JuliaLang/julia/pull/27188
So isn’t that the answer to the question?
I guess it may be possible to optimize
Base._typed_vcat (which implements it) further for some special cases (eg sparse arrays), but yes, #27188 takes care of this in a very nice way.
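For concreteness, here is a small sketch of the calling pattern (array contents and sizes are made up; the point is just that the whole vector of vectors goes through one specialized `reduce` call rather than many pairwise `vcat`s):

```julia
# A vector of vectors to flatten.
vv = [rand(Int, 100) for _ in 1:1000]

# The specialized reduce path: one call, one result buffer.
flat = reduce(vcat, vv)

# Contrast with splatting, which goes through the generic varargs vcat
# and forces compilation for a 1000-argument method.
flat2 = vcat(vv...)

flat == flat2  # same result either way
```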
This also suggests the following general strategy: whenever one feels like coding a special-cased implementation for something like this, they might consider making the extra marginal effort of wrapping it up as a PR.
Sometimes not so marginal. A while ago I wrote this (simplified here)
```julia
function merge_dict(ds...)
    t = eltype(ds)
    merge_dict!(t(), ds)
end

function merge_dict!(d::AbstractDict, others)
    for other in others
        @inbounds for (k, v) in other
            d[k] = v
        end
    end
    return d
end
```
```julia
julia> v = [Dict(rand(1:1000) => rand() for _ in 1:10) for _ in 1:10^4];

julia> @btime merge_dict(v...);
  2.191 ms (21 allocations: 327.00 KiB)

julia> @btime merge(v...);
ERROR: StackOverflowError:

julia> v = [Dict(rand(1:1000) => rand() for _ in 1:10) for _ in 1:10^3];

julia> @btime merge_dict(v...);
  236.527 μs (21 allocations: 116.06 KiB)

julia> @btime merge(v...);
  49.820 ms (11963 allocations: 27.52 MiB)
```
I submitted a PR about a year ago making this a method for
merge. But my version breaks type-stability, which is important in some cases (though obviously not here). Still, my code is useful in many situations (and I actually use a version that takes a
Vector rather than splatting. However, splatting 10^3 or 10^4 arguments entails very little performance penalty.)
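A minimal sketch of the `Vector`-taking variant I mean (the name `merge_dicts` and the use of `empty` are my choices here for illustration, not what the PR does):

```julia
# Merge a Vector of dicts into a fresh dict of the same type.
# Later dicts win on key collisions, matching merge semantics.
function merge_dicts(ds::Vector{<:AbstractDict})
    isempty(ds) && throw(ArgumentError("need at least one dict"))
    d = empty(first(ds))  # empty dict of the same key/value types
    for other in ds
        for (k, v) in other
            d[k] = v
        end
    end
    return d
end

merge_dicts([Dict(1 => 2), Dict(1 => 3, 2 => 4)])  # Dict(1 => 3, 2 => 4)
```

Taking a `Vector` sidesteps the splatting cost entirely, and the single pre-created destination dict keeps allocations down.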
So I changed the PR, upon suggestion, to improve the efficiency of
reduce(merge, dicts) instead. There were many choices, but getting the basic implementation right was fast. Pushing the PR through was an odyssey. Because
merge is so widely used, things have to be done carefully. I forgot, and just looked now: the PR still has not been merged. It was bumped less than a day ago.
It was a great learning experience, but quite a bit of effort. This depends also on how experienced you are at making PRs to base.
Just curious: what is the reason to use
(BTW I needed to add the word “function” to avoid an interrobang.)
That’s a good question. It doesn’t seem to make sense. My best recollection is that I copied it from code in base. Or maybe I tried it and got a performance improvement.
FWIW the relevant concept to implement here is possibly flatmap. Then flatten(vv) = flatmap(x->x, vv)
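A minimal `flatmap` sketch built on `Iterators.flatten`, just to illustrate the concept (not a proposed Base implementation; I believe newer Julia versions have since added an `Iterators.flatmap`):

```julia
# flatmap: map f over xs, then flatten the results one level.
flatmap(f, xs) = collect(Iterators.flatten(f(x) for x in xs))

# flatten is then the identity-mapped special case.
flatten2(vv) = flatmap(identity, vv)

flatmap(x -> (x, x), 1:3)   # [1, 1, 2, 2, 3, 3]
flatten2([[1, 2], [3, 4]])  # [1, 2, 3, 4]
```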
My understanding is that with the PR above having been merged, this is now handled nicely in
This is the one function that I keep wanting in core Julia, because I keep reimplementing or googling it so often. I don’t think splatting huge arrays gives you good performance. Something like concatenate, or vconcatenate and hconcatenate, analogous to maximum and max, would be really nice.
Tamas’ solution is the one implemented in Base, intended to solve this exact problem.
[ Scratch the flatmap suggestion (for now). ]
reduce(vcat, vector_of_vectors) seems to imply iteratively applying vcat, which seems like a possibly inefficient way of doing this, and certainly there is nothing in the documentation to suggest otherwise. You can look at the code, but we shouldn’t want people to have to dive through reduce.jl just to understand what their code is doing. There are some hints about reduce having specific implementations for different operations, which I mostly read as “this will do what you want, just trust us, but you will never know what your code does”.
I suppose it would be nice to know (i.e. for it to be true and documented) that there isn’t one vector allocation happening per vector in vector_of_vectors. The other thing one might hope for would be just one allocation, though I suppose log(N = total size) allocations might not be that bad (e.g. for appending to a growing vector). In any case, someone using reduce(vcat, vector_of_vectors) is going to find it hard to know whether there are N allocations happening, or log(N), or 1, and whether either of the two “efficient” flatten approaches (filling a pre-allocated vector, or appending to a growing vector) is being used, or something else altogether.
The documentation for reduce says:
“Reductions for certain commonly-used operators may have special implementations, and should be used instead: …”
So I suppose, to be consistent with the design intent expressed in the documentation, there should be a new function for the special implementation of reduce with the vcat operator.
You seem to be misunderstanding something — there is a specialized implementation of
reduce(vcat, ...), implemented by the PR linked above. The entry point is here.
It does the right thing, no extra allocations. This should be both efficient and clear, a win-win.
This is not true — of course you can learn what the code does, rather easily. Julia is open source.
If you think this could be documented better, please make a pull request:
(Also, generally, it is not recommended to resurrect old topics unless there is a good reason.)
Thanks. I also just used @benchmark to see how many allocations were being made and observed just 1.
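For anyone who wants to check this quickly without BenchmarkTools, a rough sketch using `@allocated` (warm up first so compilation is not counted; exact byte counts will vary by machine and Julia version):

```julia
vv = [collect(1:100) for _ in 1:100]  # 10_000 Ints in total

reduce(vcat, vv)  # warm-up run, so compilation isn't measured

# Essentially just the bytes of the result array (~80 kB of Int64s),
# i.e. one big buffer rather than one allocation per inner vector.
bytes = @allocated reduce(vcat, vv)
```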
Where would the documentation of this feature go? It’s a feature of reduce, but someone looking for advice on how to properly flatten a vector of vectors would be unlikely to look there.
On a completely personal note: I find myself not wanting to use reduce, because the documentation explicitly states that there are no guarantees, now or in the future, on what it does.
I guess the docstring for
reduce should mention this. Anyway, somebody looking for this is more likely to look at
?reduce than at the docstring of a function whose name is precisely what they are looking for!
I’m surprised by the recommendation to use special methods like
maximum, as these are defined precisely in terms of
reduce. Maybe these are out of date. Can you file an issue on GitHub?
If the docs are giving you that impression we should tweak them a bit. Do you refer to the following?
If provided, the initial value init must be a neutral element for op that will be returned for empty collections. It is unspecified whether init is used for non-empty collections.
This does seem strange, especially given that one of the examples violates this rule by using something other than the multiplicative identity:
reduce(*, [2; 3; 4]; init=-1) == -24, demonstrating that the value of
init is used in this case.
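To make the neutral-element requirement concrete, a small sketch of what is and is not guaranteed by the docstring:

```julia
# Guaranteed: init is what you get back for an empty collection.
reduce(*, Int[]; init = 1)       # 1

# With a neutral element, folding init in or not makes no difference.
reduce(*, [2, 3, 4]; init = 1)   # 24

# With a non-neutral init, the result depends on unspecified behavior;
# current implementations fold it in, giving the doc's -24.
reduce(*, [2, 3, 4]; init = -1)  # -24
```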
The wording looks somewhat out of date but there’s also the matter of what’s easier to read. Personally I’d prefer to see code using
sum rather than
reduce(+, xs). Having said that, I do wonder whether a sufficiently nice syntax for reduction could let us delete some of the special purpose reduction verbs in the future. There was some interesting speculation about this at https://github.com/JuliaLang/julia/issues/32860 (read further than the initial proposal — there’s a range of options discussed there).
I think it’s interesting that we currently conflate binary operators like
vcat(a,b), with a lifted version of the binary operator
vcat_(xs...) = reduce(vcat, xs). This seems similar to the conflation of vectorized functions with their scalar versions prior to broadcast syntax being introduced. The introduction of broadcast syntax let us delete all the “broadcast-lifted” (ie, vectorized) methods. Similarly, having a reduction syntax would let us delete all the “reduction-lifted” methods.
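To make the conflation concrete, nothing new here, just the two faces of vcat side by side:

```julia
a, b, c = [1, 2], [3], [4, 5]

vcat(a, b)               # binary use: [1, 2, 3]
vcat(a, b, c)            # "reduction-lifted" variadic use: [1, 2, 3, 4, 5]
reduce(vcat, [a, b, c])  # the same result as an explicit reduction
```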
This is not just a matter of being easier to read.
reduce(+, xs) must return the result of applying
+ repeatedly. The
sum function on the other hand is free to have a different return type than that of
+, reducing the risk of integer overflow.
```julia
julia> xs = Int8[100 27 1];

julia> sum(xs)
128

julia> reduce(+, xs)
-128
```
On this topic, I think that a
flatmap verb would still be useful:
1. Different typing of results
2. It allows one to map as well as flatten
3. A verb that is obviously correct and optimized
4. It maps directly onto a concept that people know from other languages
(3) is not just a documentation issue; the specialization is non-apparent from the name. Hence, it would often be better to spell this like
reduce(vcat, stuff)  # special-cased in Base, faster than it looks. Being correct is not enough; code should try to be obviously correct (which is a much higher bar). That is a question of coding style, though.