Best way to concatenate an array of arrays while applying a function

Continuing on this nice discussion about reduce: would there be a way to use the same syntax while applying another function, if at all relevant?

Basically, I have some calls that look like reduce(vcat, fun.(array_of_array)), where I want to apply fun to all items in array_of_array. Given that I am not interested in the result of any single call of fun, would it make a difference to have something like reduce(vcat, a -> fun(a), array_of_array)?

I am not familiar enough with reduce to say if something already exists or not to do that…

Didn’t look at the other discussion but sounds just like mapreduce to me.

You could use mapreduce:

arr_of_arr = [[rand() for _ in 1:3] for _ in 1:5]

fun(x) = x / 2
mapreduce(fun, vcat, arr_of_arr)

edit: I was a little bit too slow

[rand(3) for _ in 1:5]

Another issue is if you want the map function to work element-wise on each subarray.
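To make that concrete, here is a small sketch (square is a made-up scalar function): mapreduce passes each whole subarray to the function, so an element-wise function has to be broadcast inside the mapped function:

```julia
arr_of_arr = [[1, 2, 3], [4, 5]]

square(x) = x^2   # defined for scalars only

# Broadcasting inside the mapped function applies `square` to every element
# of each inner vector before the vcat reduction:
mapreduce(v -> square.(v), vcat, arr_of_arr)   # [1, 4, 9, 16, 25]

# By contrast, mapreduce(square, vcat, arr_of_arr) would call `square` on
# whole vectors and throw a MethodError, since ^ is not defined for a Vector.
```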

Note that reduce(vcat, fun.(array)) may well hit a special method for vectors of vectors or matrices, which allocates one array to write into. There is no such method for mapreduce(fun, vcat, array), and thus it will call vcat many times, allocating length(array)-1 arrays in the process. This will probably be much more expensive than the one temporary array from fun.(array).

If fun(a) returns a vector, then you can try vec(stack(fun, array)), with using Compat first.


Damn, I forgot mapreduce. I do not have that map reflex yet.

I just benchmarked it quickly and it turns out you are very much right about this. For instance:

fun = x -> 3*x
arr_of_arr = [[rand() for _ in 1:1000] for _ in 1:100]
julia> @benchmark reduce(vcat,$fun.($arr_of_arr))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  136.800 ΞΌs …   5.901 ms  β”Š GC (min … max):  0.00% … 96.36%
 Time  (median):     170.600 ΞΌs               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   224.968 ΞΌs Β± 309.902 ΞΌs  β”Š GC (mean Β± Οƒ):  19.03% Β± 13.03%

  β–ˆβ–ˆβ–„β–‚β–β–‚β–  ▁                                                  ▁ β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–ˆβ–‡β–†β–…β–„β–„β–ƒβ–„β–ƒβ–„β–„β–β–ƒβ–β–β–ƒβ–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–„β–…β–‡β–†β–†β–ˆ β–ˆ
  137 ΞΌs        Histogram: log(frequency) by time       2.09 ms <

 Memory estimate: 1.54 MiB, allocs estimate: 104.

julia> @benchmark mapreduce($fun,vcat,$arr_of_arr)
BenchmarkTools.Trial: 1462 samples with 1 evaluation.
 Range (min … max):  2.166 ms … 15.543 ms  β”Š GC (min … max):  0.00% … 46.46%
 Time  (median):     3.730 ms              β”Š GC (median):    35.78%
 Time  (mean Β± Οƒ):   3.402 ms Β±  1.167 ms  β”Š GC (mean Β± Οƒ):  23.45% Β± 17.79%

  β–‚β–…β–†β–ˆβ–…β–‚β–‚β–‚  ▁       ▁▇▇▇▆▄▃▂  ▁                              ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–‡β–„β–†β–†β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–…β–‡β–‡β–‡β–†β–†β–†β–†β–†β–„β–β–…β–β–†β–β–„β–…β–β–…β–β–β–„β–β–„β–β–„β–β–β–„ β–ˆ
  2.17 ms      Histogram: log(frequency) by time      6.8 ms <

 Memory estimate: 39.30 MiB, allocs estimate: 297.

Do you know why mapreduce cannot hit the same methods as reduce?

That’s because reduce can check the sizes beforehand to know how much memory to allocate, but mapreduce cannot know the sizes of the arrays output by the user function before it is called.
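To illustrate what checking the sizes beforehand buys you, here is a minimal sketch of that strategy (flat_reduce_vcat is a made-up name, not the actual Base implementation):

```julia
# Compute the total length up front, allocate the output once, and copy
# each inner vector into place. This is roughly what the specialised
# reduce(vcat, ...) method can do, and what mapreduce cannot, since the
# sizes of f's outputs are unknown before f is called.
function flat_reduce_vcat(vs::Vector{<:Vector{T}}) where {T}
    out = Vector{T}(undef, sum(length, vs))   # single allocation
    i = 1
    for v in vs
        copyto!(out, i, v, 1, length(v))
        i += length(v)
    end
    return out
end
```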

This is a strong statement and I only half mean it, but… I kind of wish I had never learned about mapreduce. I reach for it quite often now but I usually find out that it is doing things in a pretty inefficient way (because of the shortcomings listed here).

using Compat

julia> @benchmark reduce(vcat,$fun.($arr_of_arr))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  154.500 ΞΌs …   4.178 ms  β”Š GC (min … max):  0.00% … 83.20%        
 Time  (median):     215.400 ΞΌs               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   284.858 ΞΌs Β± 259.627 ΞΌs  β”Š GC (mean Β± Οƒ):  14.05% Β± 14.42%        

  β–„β–‡β–ˆβ–‡β–†β–…β–„β–„β–ƒβ–ƒβ–‚β–                                      ▁           β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–†β–†β–†β–†β–†β–†β–…β–„β–„β–„β–„β–„β–ƒβ–„β–„β–„β–„β–ƒβ–…β–ƒβ–‚β–…β–„β–…β–„β–ƒβ–„β–„β–…β–ƒβ–„β–†β–ˆβ–ˆβ–‡β–‡β–‡β–†β–†β–†β–‡β–†β–†β–† β–ˆ
  154 ΞΌs        Histogram: log(frequency) by time       1.43 ms <

 Memory estimate: 1.54 MiB, allocs estimate: 104.

julia> @benchmark mapreduce($fun,vcat,$arr_of_arr)
BenchmarkTools.Trial: 922 samples with 1 evaluation.
 Range (min … max):  3.434 ms … 15.972 ms  β”Š GC (min … max):  0.00% … 44.82%
 Time  (median):     4.903 ms              β”Š GC (median):    15.35%
 Time  (mean Β± Οƒ):   5.394 ms Β±  1.416 ms  β”Š GC (mean Β± Οƒ):  14.93% Β±  6.12%

          β–ˆβ–…β–„β–β–‚
  β–‚β–ƒβ–„β–‚β–ƒβ–ƒβ–ƒβ–…β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–„β–…β–„β–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–‚β–ƒβ–ƒβ–‚β–ƒβ–‚β–ƒβ–ƒβ–‚β–ƒβ–ƒβ–‚β–ƒβ–ƒβ–„β–ƒβ–ƒβ–‚β–‚β–ƒβ–β–‚β–‚β–‚β–‚β–‚β–β–β–β–β–β–β–β–‚ β–ƒ
  3.43 ms        Histogram: frequency by time        10.8 ms <

 Memory estimate: 39.30 MiB, allocs estimate: 297.

julia> @benchmark vec(stack($fun, $arr_of_arr))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  116.700 ΞΌs …   3.675 ms  β”Š GC (min … max):  0.00% … 85.37%        
 Time  (median):     194.900 ΞΌs               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   252.690 ΞΌs Β± 194.317 ΞΌs  β”Š GC (mean Β± Οƒ):  11.77% Β± 14.80%        

  ▁ β–‚β–ˆβ–ˆβ–†β–…β–…β–„β–„β–„β–ƒβ–‚β–β–                                               β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–…β–‡β–†β–†β–†β–†β–…β–…β–…β–…β–„β–…β–„β–…β–…β–†β–†β–†β–†β–†β–‡β–‡β–‡β–‡β–†β–†β–‡β–†β–‡β–†β–†β–…β–†β–†β–…β–…β–…β–…β–…β–…β–ƒβ–ƒβ–… β–ˆ
  117 ΞΌs        Histogram: log(frequency) by time       1.15 ms <

 Memory estimate: 1.54 MiB, allocs estimate: 104.

By the way, this will only work if fun always returns a vector of the same length.

This is true, although it could still do better than pairwise vcat, for instance by guessing that all are the same size as the first. Or by repeated doubling like append!.
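The repeated-doubling idea can be sketched like this (flatten_append is a hypothetical helper for illustration, not part of Base):

```julia
# Grow a single output vector by amortised doubling (append!), instead of
# allocating a fresh array for every pairwise vcat.
function flatten_append(f, vs)
    out = copy(f(first(vs)))          # start from the first result
    for v in Iterators.drop(vs, 1)
        append!(out, f(v))            # append! reallocates geometrically
    end
    return out
end
```

Unlike stack, this works even when f returns vectors of different lengths, at the cost of not knowing the final size in advance.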

Indeed, this is one of the things which makes stack easier.

Mostly nobody has got around to it. PR 31636 had a go. Another option would simply be to send this to reduce(vcat, map(f, xs)), as this will usually be better. (This is also what happens for mapreduce(f, op, xs, ys) at present.)

The other option would be to write a flatten companion for stack, since these clever reduce overloads are hard to discover and quite fragile – it’s easy to step off the fast path, as with the init keyword, for example.
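Until such a companion exists, one alternative that sidesteps the fragile reduce fast path entirely is collecting a lazy Iterators.flatten (flatten_map is a made-up name for illustration):

```julia
# Lazily flatten the mapped results and collect once; no intermediate
# fun.(array) is materialised, and no pairwise vcat is involved.
flatten_map(f, vs) = collect(Iterators.flatten(f(v) for v in vs))
```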