Pre-allocation for matrices / vectors of matrices of different length

Hi all,

I am running a bunch of simulations, and I want to save the outcome of these N simulations.
The parameters for each simulation are a p of type Vector{Matrix{Float32}}, and there are M vectors X, Y that I’d like to store for each p.
The dimensions of p, X, and Y are known a priori.

Currently I’m doing something along the lines of

data = Dict()
X_dim = ... # length of X
Y_dim = ... # length of Y
for i in 1:N
    p = generate_params()
    Xs = Matrix{Float32}(undef, M, X_dim)
    Ys = Matrix{Float32}(undef, M, Y_dim)
    for j in 1:M
        X = rand(X_dim)
        Y = simulation(X, p)
        Xs[j, :] .= X
        Ys[j, :] .= Y
    end
    # merge! takes another dictionary, so wrap the pair in a Dict
    merge!(data, Dict(i => (p = p, Xs = Xs, Ys = Ys)))
end

to store the data.
But (unsurprisingly) this leads to a lot of garbage collection time, and I’m wondering if there is a good way of preallocating memory to speed things up.
I suspect there must be, as all dimensions are known, but I can’t figure out an elegant way to do it that still keeps track of all the data easily!

Do you really need a dictionary if all you are doing is indexing simulations with i in 1:N? Why not a vector of matrices?
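
A minimal sketch of that idea, assuming the results are kept as a vector of named tuples indexed by i. The sizes, the ResultT alias, and the bodies of generate_params and simulation are stand-ins invented for illustration, not the original poster's code:

```julia
# Hypothetical sizes and stand-in functions, for illustration only
N, M, X_dim, Y_dim = 3, 4, 5, 2
generate_params() = [rand(Float32, 2, 2) for _ in 1:3]   # stand-in
simulation(X, p) = fill(sum(X), Y_dim)                   # stand-in

# Concrete element type, so the result vector is fully typed
ResultT = @NamedTuple{p::Vector{Matrix{Float32}}, Xs::Matrix{Float32}, Ys::Matrix{Float32}}

# One slot per simulation, allocated once up front
results = Vector{ResultT}(undef, N)
for i in 1:N
    p = generate_params()
    Xs = Matrix{Float32}(undef, M, X_dim)
    Ys = Matrix{Float32}(undef, M, Y_dim)
    for j in 1:M
        X = rand(Float32, X_dim)
        Xs[j, :] .= X
        Ys[j, :] .= simulation(X, p)
    end
    results[i] = (p = p, Xs = Xs, Ys = Ys)
end
```

Afterwards results[i].Xs and results[i].Ys give the matrices for simulation i, much like the dictionary lookup would.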

It seems to me that the only allocations that can be avoided are X and Y.

You need to allocate N instances of p, Xs, and Ys because you store those.
Assigning them to the Dict does not copy the arrays, though I don’t know how merge! operates. data[i] = (p = p, Xs = Xs, Ys = Ys) would seem easier anyway.
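
As a small illustration of why the assignment itself is cheap: storing an array in a Dict keeps a reference to the same array rather than a copy. The variable names and sizes here are made up for the example:

```julia
d = Dict{Int, Matrix{Float32}}()
sizehint!(d, 100)            # optional: reserve room for the expected number of entries

Xs = zeros(Float32, 2, 3)
d[1] = Xs                    # stores a reference to Xs, no copy of the data

Xs[1, 1] = 42f0              # mutate the original array...
same_object = d[1] === Xs    # ...and the Dict entry is the very same object
stored_value = d[1][1, 1]    # so it sees the new value
```

Giving the Dict concrete key and value types, as above, also avoids the untyped Dict{Any, Any} that a bare Dict() creates.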

To avoid allocating X and Y, in your loop:

@views for j = 1 : M
  Xs[j,:] .= rand(X_dim)
  Ys[j,:] .= simulation(Xs[j,:], p)
  # or simulation!(Ys[j,:], Xs[j,:], p)
end

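You could even do rand!(Xs[j,:]) (rand! comes from the Random standard library), which fills the row in place instead of allocating a temporary vector.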
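
Putting those pieces together, here is a sketch of a fully in-place inner loop. It assumes the simulation can be written as an in-place simulation!(Y, X, p); the body below is a placeholder, and the sizes and p are invented for the example:

```julia
using Random   # provides rand!

M, X_dim, Y_dim = 4, 5, 2          # hypothetical sizes
p = [rand(Float32, 2, 2)]          # hypothetical parameters

# Hypothetical in-place simulation: writes its result into Y
function simulation!(Y, X, p)
    fill!(Y, sum(X) * p[1][1, 1])
    return Y
end

Xs = Matrix{Float32}(undef, M, X_dim)
Ys = Matrix{Float32}(undef, M, Y_dim)

# @views turns every slice in the loop body into a view, so no row is copied
@views for j in 1:M
    rand!(Xs[j, :])                      # fill row j with random numbers in place
    simulation!(Ys[j, :], Xs[j, :], p)   # write the output straight into row j
end
```

Since Julia arrays are column-major, it may also be worth storing each sample as a column (Xs of size X_dim by M, sliced as Xs[:, j]) so that every slice is contiguous in memory.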