"Cat" very slow for relativelly small tasks

This makes my 16 GB system hang for a while and eventually crash with a StackOverflowError, even though top doesn’t show anywhere near that much memory in use (the matrices themselves should be ~400 MB):

mainM = rand(28,28,60000)
toAdd = [rand(28,28) for i in 1:12800]
totM = cat(mainM, toAdd...,dims=3)

Curious what the culprit is and which alternatives I could use…

Found a workaround:

mainM = rand(28,28,60000)
toAdd = [rand(28,28) for i in 1:12800] 
B     = reduce(hcat, toAdd)
C     = reshape(B, 28, 28, :)
totM  = cat(mainM, C, dims=3)

It seems cat doesn’t like being called with the thousands of separate arguments that the splat operator produces…
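For anyone who wants to check the workaround quickly, here is a minimal sketch on a smaller instance. The reduced sizes (600 and 128 instead of 60000 and 12800) are my assumption to keep it fast, and the preallocate-and-copy variant (`totM2`) is just another way to avoid the splat, not something from the original post:

```julia
# Sizes shrunk from the original 60000 / 12800 so the check runs quickly.
mainM = rand(28, 28, 600)
toAdd = [rand(28, 28) for i in 1:128]

# Workaround from above: hcat all slices, then reshape into a 3-D array.
C    = reshape(reduce(hcat, toAdd), 28, 28, :)
totM = cat(mainM, C, dims=3)

# Alternative without any cat: preallocate the result and copy slice by slice.
n     = size(mainM, 3)
totM2 = Array{Float64}(undef, 28, 28, n + length(toAdd))
totM2[:, :, 1:n] = mainM
for (i, m) in enumerate(toAdd)
    totM2[:, :, n + i] = m
end

totM == totM2  # both approaches should give the same array
```

Neither version passes thousands of arguments to a single call, which appears to be what the splatted cat struggles with.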

If you need vcat or hcat, the common suggestion is to use reduce instead of splatting. But I don’t think reduce performs well with the general cat, so I’d suggest the following solution, which is somewhat cleaner than reshape:

julia> using SplitApplyCombine
julia> cat(mainM, combinedims(toAdd), dims=3)

This uses an external package, but SplitApplyCombine is also useful for many other tasks involving array- or table-like data.
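If you are on Julia 1.9 or later, Base also has `stack`, which should do the same job here without an extra package. A minimal sketch, again with reduced sizes as an assumption:

```julia
mainM = rand(28, 28, 600)
toAdd = [rand(28, 28) for i in 1:128]

# stack puts each 28×28 matrix along a new trailing dimension -> 28×28×128
C    = stack(toAdd)
totM = cat(mainM, C, dims=3)

size(totM)  # (28, 28, 728)
```

Only two arrays are passed to cat here, so there is no splatting of thousands of arguments.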
