In my current project I have a workflow of creating a lot of 1D arrays, hcat-ing them into a matrix and then transforming it into a sparse representation, as shown below (without the sparse call).
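function getData(n::Int)
    X = Vector{Float64}[]  # empty vector of vectors
    for i = 1:n
        a = rand(2053)
        a[a .> 0.05] .= 0.0  # zero out most entries so the data is sparse
        push!(X, a)
    end
    hcat(X...)  # splatting n vectors: one call signature per distinct n
end

i = 1
while true
    A = getData(rand(1:20000))
end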
This creates an issue when the number of arrays n varies a lot, because Julia compiles a “new” specialization of hcat(X...) for each n that is encountered (my intuition). Although the growth in memory consumption slows down as there are eventually fewer new methods to compile, it can easily exceed even quite modest limits. For example, the code above fails after a while when running with a 4GB memory limit, and in a multiprocessing environment this behavior is even more devastating.
I have been able to solve this problem by writing my own version of hcat. However, I feel like it deserves a more general solution, as it affects other functions where such argument splatting is used.
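For illustration, here is one way a splat-free replacement could look. This is only a minimal sketch, not necessarily the version referred to above: it assumes Julia 1.0 or later and equal-length vectors, and the name `hcat_columns` is hypothetical. Because the function takes the whole collection as a single argument, it should be compiled once per element type rather than once per number of vectors.

```julia
# Hypothetical helper (not from the original post): builds a matrix whose
# columns are the vectors in X, without splatting X into a varargs call.
function hcat_columns(X::Vector{Vector{T}}) where {T}
    # No vectors at all: return an empty 0x0 matrix.
    isempty(X) && return Matrix{T}(undef, 0, 0)
    m = length(X[1])                      # expected length of every column
    A = Matrix{T}(undef, m, length(X))    # preallocate the full result
    for (j, x) in enumerate(X)
        length(x) == m ||
            throw(DimensionMismatch("column $j has length $(length(x)), expected $m"))
        A[:, j] = x                       # copy the j-th vector into column j
    end
    return A
end
```

In `getData`, the last line would then become `hcat_columns(X)` instead of `hcat(X...)`. On recent Julia versions, `reduce(hcat, X)` may be worth trying as well, since it also takes the collection as one argument.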