# Collect model outputs into multi-dimensional array

Consider the following model which creates an output after every layer

function track(model, input)
track::Vector{Vector{Float64}} =

``````p = input
for layer in model
p = layer(p)
track = vcat(track, [p,])
end
return track
``````

end

which is unfortunately quite inefficient. Is there a way to store the outputs more efficiently into preallocated memory? Ideally into a mutli-dimensional array?

Sure, if you know the size(s) beforehand (and things are rectangular, i.e. size of the output of each layer is the same) E.g. for 5 layers, each with an output of length 64, you can initialize a multi(i.e. 2)-dimensional array like this:

``````track = Matrix{Float64}(undef, 64, 5)
``````

And then loop over the layers like this:

``````for (i, layer) in enumerate(model)
p = layer(p)
track[:, i] = p
end
``````

Not that this still allocates a vector `p` and then copies it into your matrix. To avoid allocations completely, you would want to rewrite your `layer` function to operate in-place on a pre-allocated vector (and then pass in e.g. `@view track[:, i]` to write the results directly into the matrix column).

1 Like

Unfortunately this approach is not supported by Flux/Zygote

Mutating arrays is not supported – called setindex!(::Matrix{Float64}, _…)

But I guess there must be some workaround for this since the array is not mutated in a mathematical sense. In contrast to in-place operations which override data necessary for calculation of gradients, this just stores results into memory.

I don’t think it was clear up front that this was in a gradient context, because that changes the solution space quite dramatically. There are two ways you can go about this in Zygote:

1. Use `Zygote.Buffer` instead of a plain array.
2. Put your layers in a `Chain`, call `Flux.activations` to get a set of outputs and then `reduce(hcat, outputs)` to allocate the array once.

#1 is your best shot (short of writing AD rules) for the in-place option @stevengj described. #2 may be faster if you can live with the `p = layer(p)` allocation for each layer, but I would try both just to be sure.

2 Likes

Thanks a lot!

In case I stick with ‘p = layer(p)’ Flux.activations outperforms Zygote.Buffer significantly:

60.174614 seconds (79.67 M allocations: 16.958 GiB, 3.24% gc time)

vs

210.037932 seconds (691.79 M allocations: 49.948 GiB, 4.98% gc time)