What's the proper way to create and initialize a vector of vectors of matrices and vectors?

Hi, I’m trying to create and initialize two objects: x will be a vector of vector of vectors, and y will be a vector of vector of matrices. I tried

x = Vector{Vector{Vector{Float64}}}(undef, 5)
y = Vector{Vector{Matrix{Float64}}}(undef, 5)

But I ran into ERROR: UndefRefError: access to undefined reference when trying to fill in the contents with x[1][1] = ones(3) or y[1][1] = ones(3,3).

What is the proper way to do this?

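This is because the elements of x and y are uninitialized (#undef). Vector{...}(undef, 5) allocates only the outer vector, so each x[i] has to be assigned an inner vector before you can index into it.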

julia> x = Vector{Vector{Vector{Float64}}}(undef,5);

julia> x[1]
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex(::Array{Array{Array{Float64,1},1},1}, ::Int64) at .\array.jl:788
 [2] top-level scope at REPL[2]:1

julia> x[1] = Vector{Vector{Float64}}(undef,2)
2-element Array{Array{Float64,1},1}:
 #undef
 #undef

julia> x[1][1] = ones(3)
3-element Array{Float64,1}:
 1.0
 1.0
 1.0

The data structure you are constructing is fairly complicated: what are you trying to achieve?

I think the types you are using are a little worrying (and this is coming from someone who uses Tuples and Vectors for everything). The easiest way to initialize them is probably with comprehensions, for example:

julia> [[Vector{Float64}(undef, 4) for _ = 1:5] for _ = 1:6]
6-element Array{Array{Array{Float64,1},1},1}:
...

This comprehension gives a Vector{Vector{Vector{Float64}}}. The outer vector has six elements, each of those is a vector of five elements, and each of those five is a Vector{Float64} of length four (with uninitialized values, because of undef).
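
The same pattern covers the y from the original question. Here is a minimal sketch, assuming illustrative sizes (5 subsets, 2 repeats, length-3 results) that do not come from the thread, and using zeros instead of undef so that every slot holds defined values:

```julia
# Sizes are illustrative assumptions, not taken from the thread.
n_subsets, n_repeats = 5, 2

# The comprehension allocates every inner vector/matrix, so no slot is #undef.
# Each iteration calls zeros again, so the inner arrays are independent objects.
x = [[zeros(3) for _ in 1:n_repeats] for _ in 1:n_subsets]
y = [[zeros(3, 3) for _ in 1:n_repeats] for _ in 1:n_subsets]

x[1][1] = ones(3)      # works: x[1] is already a Vector{Vector{Float64}}
y[1][1] = ones(3, 3)   # works: y[1] is already a Vector{Matrix{Float64}}
```

With this, the assignments from the original question no longer hit an undefined reference.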

Thanks for the solution!

This is for constructing a type that will be used for saving results from simulations. The simulation runs on different subsets (the first Vector), on each subset there are many repeats (the second vector), at each repeat I would get results as vectors and matrices (the third Vector, or Matrix).

I’ve thought about whether to use NamedTuples, but from what I read, I don’t think there’s an advantage in using them. What are your thoughts?

Thanks for your solution too.

I think the types you are using are a little worrying (and this is coming from someone that uses Tuples and Vectors for everything).

Is there a different structure you would suggest? Like I explained in my response to @mzaffalon, I’ve thought about using NamedTuples, but thought that Vector just seems easier to implement.

If all the elements of an outer Vector are Vectors of the same fixed size, I would suggest using an Array{Float64, 3} for your x (and an Array{Float64, 4} for your y).
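
As a rough sketch of the fixed-size case, assuming made-up sizes and an index order chosen here (result dimensions first, then repeat, then subset) that the thread does not prescribe:

```julia
n_subsets, n_repeats, len = 5, 2, 3   # assumed sizes, for illustration only

# x[:, r, s] is the result vector of repeat r on subset s.
x = zeros(len, n_repeats, n_subsets)          # Array{Float64, 3}
# y[:, :, r, s] is the result matrix of repeat r on subset s.
y = zeros(len, len, n_repeats, n_subsets)     # Array{Float64, 4}

x[:, 1, 1] = ones(len)
y[:, :, 1, 1] = ones(len, len)
```

Putting the result dimensions first keeps each result contiguous in memory, since Julia arrays are stored column-major.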

If you need “sibling” Vectors to have different sizes (e.g., x[1] and x[2], or x[1][2] and x[1][3]), then things get more complicated and your solution may end up being the best one. You could also define struct types Sample{T} and Experiment{T}, so that the type shortens to Experiment{T}. Experiment{T} would have a field configs :: Vector{Sample{T}} (or a Dict{Symbol, Sample{T}}, if you prefer to give names to the configs instead of using numbers), and Sample{T} would have a field datapoints :: Vector{T} (or maybe named results). It is up to you.
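
A minimal sketch of those two types could look like the following. The struct and field names follow the suggestion above, but the exact layout and the sample data are assumptions:

```julia
# Hypothetical layout based on the suggestion above.
struct Sample{T}
    datapoints::Vector{T}
end

struct Experiment{T}
    configs::Vector{Sample{T}}
end

# Sibling vectors may have different lengths.
s1 = Sample([ones(3), ones(4)])   # Sample{Vector{Float64}}
s2 = Sample([ones(2)])
e = Experiment([s1, s2])          # Experiment{Vector{Float64}}
```

The type parameter T is inferred from the arguments, so the nested type never has to be written out by hand.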

What I find most strange is that you are defining and preallocating the whole structure before running the experiments, and therefore need to write out the whole type in one line. The way I write Julia code, I can also end up with a “Vector of Vector of Vector of type T”, but I usually get there through functions: one function executes a map over some data, and the function it applies executes another map over a function that returns a Vector. I end up with the same three-layered object but never have to spell out its type. It is just the return value of the functions I call, which allocate vectors of the right types and sizes for me.
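
To illustrate that style, here is a small sketch. run_repeat and run_subset are hypothetical stand-ins for the real simulation code, and the sizes are made up:

```julia
# Hypothetical stand-in for one simulation repeat; returns a result vector.
run_repeat(subset, rep) = fill(Float64(subset + rep), 3)

# All repeats for one subset: a Vector{Vector{Float64}}.
run_subset(subset; n_repeats = 2) = map(rep -> run_repeat(subset, rep), 1:n_repeats)

# All subsets: the three-layer type falls out of the nested maps.
x = map(run_subset, 1:5)
```

Nothing is preallocated and the nested type is never written down, yet x is a Vector{Vector{Vector{Float64}}}.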

I would also go with @Henrique_Becker’s solution: define a struct Simulation, perhaps with one field containing the simulation parameters and another field holding a Vector{Run} (this would be your second, inner vector).

The struct Run contains a Vector{Float64} with the simulation solution (and one field with a Matrix{Float64}, if needed). If each subset needs additional information, you can capture this as an extra field in Run.

The reason not to go with Vector{Vector{Vector}} is that, sooner rather than later, you will find that you have forgotten how the data is stored and which results belong to which simulation.

struct Run
  vector_result::Vector{Float64}
  matrix_result::Matrix{Float64}
end

struct Simulation
  #name::String
  parameters
  runs::Vector{Run}
end

s = Simulation(some_parameters, Vector{Run}())

At the end of each run, you construct a Run like this: r = Run(vector_results_from_simulation_run, matrix_result). Then you save the run in the simulation: push!(s.runs, r).
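
Put together, the workflow might look like the sketch below. The struct definitions are repeated so the snippet runs on its own. params and simulate are hypothetical placeholders for the real parameters and computation:

```julia
struct Run
    vector_result::Vector{Float64}
    matrix_result::Matrix{Float64}
end

struct Simulation
    parameters          # left untyped, as in the post above
    runs::Vector{Run}
end

# Hypothetical parameters and a placeholder for the real computation.
params = (n = 3, n_repeats = 4)
simulate(n) = (rand(n), rand(n, n))

s = Simulation(params, Run[])     # start with an empty list of runs
for _ in 1:params.n_repeats
    v, m = simulate(params.n)
    push!(s.runs, Run(v, m))      # the vector grows; nothing is preallocated
end
```

One Simulation per subset, collected in a Vector{Simulation}, would then give the outermost layer of the original structure.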

@Henrique_Becker @mzaffalon Thank you both for your suggestions. It makes so much sense to have two types rather than many layers of vectors. I’ll go this route. Thanks again!! (Too bad I can only mark one reply as the solution, hope you don’t mind.)

:laughing: No problem. @mzaffalon’s answer had better examples than mine, and most of the time people forget to even like the reply :sweat_smile:.
