Problem with cat()

Hello,
I’m relatively new to Julia, so I’m not sure whether the observed behavior is intended or a bug.
I want to concatenate a three-dimensional array as in this simple example:

julia> cat(ones(3,1,1000)..., dims=2)
1×3000 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  …  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

But when the array is bigger, this throws a StackOverflowError:

julia> cat(ones(3,1,10000)..., dims=2)
ERROR: StackOverflowError:

I don’t think it’s a memory issue, because I have no problem generating such an array directly:

julia> ones(1,30000)
1×30000 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  …  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

I can reproduce this behavior on two Windows 10 systems. Here is the versioninfo() of one of them:

julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = "C:\Users\Arbeit\AppData\Local\atom\app-1.31.1\atom.exe" -a
  JULIA_NUM_THREADS = 2

So, I’m curious, is the behavior to be expected?

cat(ones(3,1,10000)..., dims=2) calls cat with 10000 arguments, which is a bit much.

It’s even 30001(!)
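To make the argument count concrete, here is a minimal sketch. The helper count_args is hypothetical (not part of Base) and counts positional arguments only; the dims keyword is passed separately. Splatting an array passes every element as its own argument:

```julia
# Hypothetical helper: returns how many positional arguments it received.
count_args(args...) = length(args)

A = ones(3, 1, 10)   # small stand-in for ones(3,1,10000)

# Splatting iterates over all elements, so each one becomes an argument.
n = count_args(A...)
println(n, " arguments for an array with ", length(A), " elements")
```

The count grows with length(A), not with the size of the last dimension.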

Yes, one should be careful with splatting, it’s definitely not the right approach here. It’s a bit unclear what you really need to do, @judober, can’t you just write ones(1, 30000) directly? Or maybe reshape is what you’re looking for.

All right, thank you two. I already solved my problem using reshape (my array is more complex than just ones), so your suggestion was right. I was just wondering if cat() works as expected.
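For reference, a minimal sketch of the reshape approach, assuming the goal is the same 1×N row that the splatted cat produces. The array A below is only an example input:

```julia
A = rand(3, 1, 1000)          # example input; any 3-d array works

# reshape keeps the column-major element order, which is the same order
# in which splatting iterates over the elements.
B = reshape(A, 1, :)          # 1×3000, no splatting involved

# On this small input the splatted version still works, so the two can be compared.
C = cat(A..., dims=2)
println(size(B), " ", B == C)
```

Note that reshape shares memory with A; wrap it in copy(...) if an independent array is needed.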

I would like to follow up on this, as I use hcat(x...), where x is an array of arrays (or some other objects), quite extensively. I prefer this over reduce(hcat,x) because the former is way faster.

Besides the problem judober has identified, we have just found another one: if hcat(x...) is called many times with x containing a different number of elements each time, it slowly consumes all of Julia’s memory (because it keeps a compiled version for each distinct number of arguments) and eventually fails.

My question is: what would be the ideal way to implement a function that takes an arbitrary number of arrays and concatenates them at once? Using reduce for this is just inefficient. Should we write our own function, something like
hhcat(x::Vector{T}) where {T}?

Thanks for answers.

Are you certain about that? I find reduce(hcat, x) to be faster. Have you tried timing it like this:

using BenchmarkTools
@btime hcat($x...)
@btime reduce(hcat, $x)

I think that implementing a version of hcat/vcat which

  1. takes a vector of arrays,
  2. calculates the final size and type,
  3. does the copying

would be worthwhile.
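The three steps above could be sketched roughly as follows. The name hcat_all and its signature are assumptions made for illustration, not an existing Base function, and the sketch only handles vectors and matrices:

```julia
# Hypothetical helper, not part of Base.
# Step 1: takes a vector of arrays (vectors or matrices).
function hcat_all(xs::AbstractVector{<:AbstractVecOrMat})
    isempty(xs) && throw(ArgumentError("need at least one array"))

    # Step 2: compute the final element type and size.
    T = mapreduce(eltype, promote_type, xs)
    nrows = size(first(xs), 1)
    ncols = 0
    for x in xs
        size(x, 1) == nrows ||
            throw(DimensionMismatch("all arrays must have $nrows rows"))
        ncols += size(x, 2)      # size(v, 2) == 1 for a vector
    end
    out = Matrix{T}(undef, nrows, ncols)

    # Step 3: copy. Arrays are column-major, so each input occupies a
    # contiguous run of linear indices in the output.
    pos = 1
    for x in xs
        copyto!(out, pos, x, 1, length(x))
        pos += length(x)
    end
    return out
end

xs = [rand(3) for _ in 1:5]
println(hcat_all(xs) == reduce(hcat, xs))
```

Because the function takes a single vector argument, nothing is splatted and the number of arrays does not affect how the call is compiled.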

Whether this could be a method of hcat etc. is an API design question. IMO a different function name would be best, but perhaps someone can come up with a signature that would fit in nicely with existing ones.

In Julia 0.6.4 the difference is striking.

# 2000 arrays of length 2053
hcat - 21.037 ms (10 allocations: 31.37 MiB)
reduce - 15.698 s (18952 allocations: 15.34 GiB)

# 4000 arrays of length 2053
hcat - 37.280 ms (10 allocations: 62.74 MiB)
reduce - 32.509 s (37906 allocations: 30.75 GiB)

However, in 0.7 reduce even takes a slight lead.

# 2000 arrays of length 2053
hcat - 13.012 ms (8 allocations: 31.37 MiB)
reduce - 10.009 ms (2 allocations: 31.33 MiB)

# 4000 arrays of length 2053
hcat - 24.859 ms (8 allocations: 62.74 MiB)
reduce - 19.291 ms (2 allocations: 62.65 MiB)

# 10000 arrays of length 2053
hcat - 134.433 ms (8 allocations: 156.86 MiB)
reduce - 125.283 ms (2 allocations: 156.63 MiB)

Yes, that’s due to this PR:
https://github.com/JuliaLang/julia/pull/27188
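A minimal usage sketch, assuming Julia 0.7 or later, where (as far as I understand the PR) reduce(hcat, x) and reduce(vcat, x) on a vector of arrays are handled by a dedicated method instead of pairwise concatenation. The input x is only an example:

```julia
# Example input: 10000 vectors of length 3.
x = [fill(Float64(i), 3) for i in 1:10_000]

A = reduce(hcat, x)   # one matrix, one column per input vector
v = reduce(vcat, x)   # one long vector

println(size(A), " ", length(v))
```

Since x is passed as a single argument, the call does not depend on how many arrays it holds, which avoids the splatting problems discussed earlier in the thread.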

This is what I have been looking for. Thanks a lot.