Using hcat with splatting on arrays of different lengths kills Julia

In my current project I have a workflow of creating a lot of 1D arrays, hcat-ing them into a matrix and then converting that into a sparse representation, as shown below (without the sparse call).

function getData(n::Int)
    X = Vector{Vector{Float64}}()        # collect n column vectors
    for i = 1:n
        a = rand(2053)
        a[a .> 0.05] .= 0.0              # zero out ~95% of the entries (sparse-like data)
        push!(X, a)
    end
    hcat(X...)                           # splat the n vectors into one 2053×n matrix
end

while true
    A = getData(rand(1:20000))   # a new random column count each iteration forces a fresh hcat specialization
end

This creates an issue when the number of arrays n varies a lot, because Julia compiles a “new” hcat(X...) specialization for each n it encounters (my intuition). Although the growth in memory consumption slows down as fewer new methods remain to be compiled, it can still easily exceed reasonable memory limits. For example, the code above fails after a while when run under a 4 GB memory limit, and in a multiprocessing environment this behavior is even more devastating.
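One way to observe the effect described above (a hedged illustration, not from the original report): each distinct argument count that hcat(X...) is splatted with produces a fresh specialization, which shows up as compilation work on the first call for every new n.

X = [rand(5) for _ in 1:100];
@time hcat(X...);    # first call with 100 arguments: includes compilation
@time hcat(X...);    # same argument count again: already compiled, fast
Y = [rand(5) for _ in 1:101];
@time hcat(Y...);    # new argument count: compiles yet another specialization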

I have been able to work around this by writing my own version of hcat (a sketch is shown below), but I feel the problem deserves a more general solution, since it affects other functions where such argument splatting is used.
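A minimal sketch of the kind of hand-rolled replacement I mean (the name hcat_columns and its details are illustrative, not my exact code): it preallocates the result and copies column by column, so only one method ever gets compiled no matter how many vectors are passed.

function hcat_columns(X::Vector{Vector{Float64}})
    isempty(X) && return Matrix{Float64}(undef, 0, 0)
    m = length(X[1])
    A = Matrix{Float64}(undef, m, length(X))
    for (j, col) in enumerate(X)
        length(col) == m || throw(DimensionMismatch("all vectors must have the same length"))
        A[:, j] = col        # copy one column; no splatting, no per-n compilation
    end
    A
end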

Use reduce(hcat, X) instead, which dispatches to an efficient specialized method. In general, one should avoid splatting collections whose number of elements can vary a lot, since, as you noted, it triggers a lot of compilation.
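Applied to the workflow above, the fix is a one-line change (a sketch assuming the sparse step mentioned in the original post; the SparseArrays import is added here):

using SparseArrays

function getData(n::Int)
    X = Vector{Vector{Float64}}()
    for i = 1:n
        a = rand(2053)
        a[a .> 0.05] .= 0.0
        push!(X, a)
    end
    sparse(reduce(hcat, X))   # one specialized reduce(hcat, ...) method, no per-n compilation
end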

Thanks for the answer. I was worried about the performance of such a workaround, but it seems there is a striking difference between the 0.6 and 0.7 reduce function. In the former version reduce did not scale really well, as it had a big memory overhead, but with the latter the reduce function seems to be even faster. I have posted some benchmarks in the other topic where this issue came up - Problem with cat() - #5 by Tomas_Pevny.
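For anyone who wants to reproduce the comparison locally, a sketch using BenchmarkTools (sizes are arbitrary and no numbers are reproduced here):

using BenchmarkTools

X = [rand(2053) for _ in 1:1000];
@btime hcat($X...);         # splatted version
@btime reduce(hcat, $X);    # specialized reduce method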