Very best way to concatenate an array of arrays

reduce(hcat, v) is optimized AFAIU.

3 Likes

Just saw that https://github.com/JuliaLang/julia/pull/27188
So isn’t that the answer to the question?

I guess it may be possible to optimize Base._typed_vcat (which implements it) further for some special cases (eg sparse arrays), but yes, #27188 takes care of this in a very nice way.

This also suggests the following general strategy: whenever one feels like coding a special-cased implementation for something like this, they might consider making the extra marginal effort of wrapping it up as a PR.

1 Like

Sometimes not so marginal. A while ago I wrote this (simplified here)

function merge_dict(ds...)
    t = eltype(ds)
    merge_dict!(t(),ds)
end

function merge_dict!(d::AbstractDict, others)
    for other in others
        @inbounds for (k,v) in other
            d[k] = v
        end
    end
    return d
end
julia> v = [Dict(rand(1:1000) => rand() for _ in 1:10) for _ in 1:10^4];

julia> @btime merge_dict(v...);
  2.191 ms (21 allocations: 327.00 KiB)

julia> @btime merge(v...);
ERROR: StackOverflowError:

julia> v = [Dict(rand(1:1000) => rand() for _ in 1:10) for _ in 1:10^3];

julia> @btime merge_dict(v...);
  236.527 μs (21 allocations: 116.06 KiB)

julia> @btime merge(v...);
  49.820 ms (11963 allocations: 27.52 MiB)

I submitted a PR about a year ago making this a method for merge. But, my version breaks type-stability, important in some cases, obviously not here. Still my code is useful in many situations (and I actually a use a version that takes a Vector rather than splatting. However splatting 10^3 and 10^4 arguments entails very little performance penalty.)

So I changed the PR, upon suggestion, to improve the efficiency of reduce(merge, dicts) instead. There were many choices, but getting the basic implementation was fast. Pushing the PR through was an odyssey. Because reduce, Dicts, and merge are so widely used, things have to be done carefully. I forgot, just looked now, that the PR still has not been merged. It was bumped less than a day ago.

It was a great learning experience, but quite a bit of effort. This depends also on how experienced you are at making PRs to base.

1 Like

just a curious: what is the reason to use @inbounds in merge_dict! function?

(BTW I needed to add word “function” to avoid Interrobang :slight_smile: )

That’s a good question. It doesn’t seem to make sense. My best recollection is that I copied it from code in base. Or maybe I tried it and got a performance improvement.

1 Like

FWIW the relevant concept to implement here is possibly flatmap. Then flatten(vv) = flatmap(x->x, vv)

My understanding is that with the PR above having been merged, this is now handled nicely in Base with reduce(vcat, ...).

2 Likes

This is the one function that I keep wanting to be in core Julia because I keep reimplementing or googling this function so much. I don’t think splatting huge arrays gives you good performance. Something like concatenate or vconcatenate and hconcatenate, analogous to maximum and max, would be really nice.

Tamas solution is the one implemented in Base intended to solve this exact problem

[ Scratch the flatmap suggestion (for now). ]

reduce(vcat, vector_of_vectors) seems to imply iteratively applying vcat — which seems like a possibly an inefficient way of doing this — and certainly there is nothing in the documentation to suggest otherwise. You can look at the code, but we shouldn’t want people to have to dive through reduce.jl just to understand what their code is doing. There is are some hints about reduce having specific implementations for different operations — which I mostly read as “this will do what you want, just trust us, but you will never know what your code does”.

I suppose it would be nice to know (i.e. be true and have documented), that there wasn’t one vector allocation happening per vector in vector_of_vectors. The other thing one might hope to happen would be for there to be just one allocation, though I suppose log(N=total size) might not be that bad (e.g. for appending to a growing vector). In any case someone using reduce(vcat, vector_of_vectors) is going to find it hard to know whether there are N allocations happening, or log(N), or 1 allocation, and whether either of the two “efficient” flatten approaches (filling up one pre-allocated vector or appending to a vector) is being used, or something else altogether.

The documentation for reduce says:

" Reductions for certain commonly-used operators may have special implementations, and should be used instead: maximum(itr) , minimum(itr) , sum(itr) , prod(itr) , any(itr) , all(itr) ."

So I suppose to be consistent with the design intent expressed in the documentation that there should be a new function for the special implementation needed for reduce with the vcat operator.

You seem to be misunderstanding something — there is a specialized implementation of reduce(vcat, ...), implemented by the PR linked above. The entry point is here.

It does the right thing, no extra allocations. This should be both efficient and clear, a win-win.

This is not true — of course you can learn what the code does, rather easily. Julia is open source.

If you think this could be documented better, please make a pull request:

(Also, generally, it is not recommended to resurrect old topics unless there is a good reason.)

2 Likes

Thanks. I also just used @benchmark to see how many allocations were being made and observed just 1.

Where would the documentation of this feature go? It’s a feature of reduce, but someone looking for advice on how to properly flatten a vector of vectors would be unlikely to look there.

On a completely personal note - I personally find myself not wanting to use reduce because the documentation explicitly states there are no guarantees either now, or in the future, on what it does.

1 Like

I guess the docstring for reduce should mention this. Anyway somebody looking for this is more likely to look at ?reduce than at the docstring for a function… whose name he’s precisely looking for!

I’m surprised by the recommendation to use special methods like maximum, as these are defined precisely in terms of reduce. Maybe these are out of date. Can you file an issue in GitHub?

1 Like

If the docs are giving you that impression we should tweak them a bit. Do you refer to the following?

If provided, the initial value init must be a neutral element for op that will be returned for empty collections. It is unspecified whether init is used for non-empty collections.

This does seem strange especially given that one of the examples violates this rule by using something other than the multiplicative identity in reduce(*, [2; 3; 4]; init=-1) == -24, demonstrating that the value of init is used in this case.

3 Likes

The wording looks somewhat out of date but there’s also the matter of what’s easier to read. Personally I’d prefer to see code using sum rather than reduce(+, xs). Having said that, I do wonder whether a sufficiently nice syntax for reduction could let us delete some of the special purpose reduction verbs in the future. There was some interesting speculation about this at https://github.com/JuliaLang/julia/issues/32860 (read further than the initial proposal — there’s a range of options discussed there).

I think it’s interesting that we currently conflate binary operators like vcat(a,b), with a lifted version of the binary operator vcat_(xs...) = reduce(vcat, xs). This seems similar to the conflation of vectorized functions with their scalar versions prior to broadcast syntax being introduced. The introduction of broadcast syntax let us delete all the “broadcast-lifted” (ie, vectorized) methods. Similarly, having a reduction syntax would let us delete all the “reduction-lifted” methods.

6 Likes

This is not just a matter of being easier to read. reduce(+, xs) must return the result of applying + repeatedly. The sum function on the other hand is free to have a different return type than that of +, reducing the risk of integer overflow.

julia> xs = Int8[100 27 1];

julia> sum(xs)
128

julia> reduce(+, xs)
-128
4 Likes

On this topic, I think that a flatmap verb would still be useful:

  1. Different typing of results
  2. allows to map
  3. verb that is obviously correct+optimized
  4. directly maps from concept that people know from other languages

(3) is not just docs, it is the fact that it is non-apparent from the name. Hence, it would often be better to spell this like reduce(vcat, stuff) #special cased in Base, faster than it looks. Being correct is not enough, code should try to be obviously correct (which is a much higher bar). That is a question of coding style, though.