Suppose I have a 2x3 array called mat1.
I would like to pretend that this 2x3 array is actually a 2x3x4 array called mat1_3d_representation, such that
mat1_3d_representation[i, j, k] == mat1[i, j]
for all valid scalar i, j, k, and
mat1_3d_representation[i, j, :] == repeat([mat1[i, j]], 4)
Context:
Variables are stored at (arbitrary) different units of observation, e.g. person-time, person-firm-time-transaction, time, firm-person-time, etc.
In a model, it might make sense to add (or multiply, etc.) a variable v_t at one unit of observation (say, time) to a variable x_i_t_j at a different (superset) unit of observation (say, time-person-firm). Assume that I have good reasons not to use dataframe joins or SQL joins. These variables are represented as multidimensional arrays whose indices are integer codes for the units of observation.
I could repeat v_t I * J times and then add the arrays, but this shouldn’t be necessary. Any elementwise operation defined on two arrays of size (a, b, c, …) can also be defined on an array of size (a, b, c, …) and an array that has the same sizes but is missing some of the dimensions. A type that represents the smaller array at the higher dimension would eliminate ambiguity about what the operation means and would let multiple dispatch pick the correct method.
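A minimal sketch of the kind of type I have in mind. The name ExpandedArray and its fields (parent, dims, axesmap) are made up for illustration and do not come from any package. It stores the small array plus a note of which higher dimensions its own dimensions correspond to, and it looks up values lazily instead of repeating data:

```julia
# Hypothetical wrapper (not from any package): presents `parent` as an
# N-dimensional array of size `dims`. `axesmap[d]` is the dimension of the
# big array that dimension `d` of `parent` corresponds to.
struct ExpandedArray{T,N,M,P<:AbstractArray{T,M}} <: AbstractArray{T,N}
    parent::P
    dims::NTuple{N,Int}
    axesmap::NTuple{M,Int}
    function ExpandedArray(parent::AbstractArray{T,M}, dims::NTuple{N,Int},
                           axesmap::NTuple{M,Int}) where {T,M,N}
        # Every parent dimension must match the size of its target dimension.
        all(size(parent, d) == dims[axesmap[d]] for d in 1:M) ||
            throw(DimensionMismatch("parent size does not match the target dims"))
        return new{T,N,M,typeof(parent)}(parent, dims, axesmap)
    end
end

Base.size(A::ExpandedArray) = A.dims

# Indices along dimensions that the parent does not have are simply ignored.
function Base.getindex(A::ExpandedArray{T,N}, I::Vararg{Int,N}) where {T,N}
    @boundscheck checkbounds(A, I...)
    return A.parent[map(d -> I[d], A.axesmap)...]
end

mat1 = [1 2 3; 4 5 6]
mat1_3d_representation = ExpandedArray(mat1, (2, 3, 4), (1, 2))
mat1_3d_representation[2, 3, 4] == mat1[2, 3]          # true
mat1_3d_representation[1, 2, :] == fill(mat1[1, 2], 4) # true

# A time-only variable viewed as a (2, 3, 4) array indexed by (i, j, t).
v_t_3d = ExpandedArray([10, 20, 30, 40], (2, 3, 4), (3,))
shifted = ones(2, 3, 4) .+ v_t_3d  # ordinary broadcasting, sizes now agree
```

Because the wrapper is an AbstractArray of the full size, ordinary broadcasting needs no special cases. The cost is that every element access goes through the generic getindex fallback, so this is meant to show the idea rather than to be fast.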
Maybe there is a simple way to unambiguously apply broadcasting here? Broadcasting works directly only in special cases, namely when the dimensions of the smaller array line up with the leading dimensions of the larger one. Combining it with TransmuteDims.transmute extends it to other cases, but that gets very messy, which suggests that another layer of abstraction would be beneficial.
E.g.:
obs = I, J, T = 2, 3, 4 # observations
x_i_j_t = ones(Float64, obs)
w_i = 1:I |> collect
h_i_t = 1:(I*T) |> collect |> v -> reshape(v, (I, T))
v_t = 1:T |> collect
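x_i_j_t .+ w_i # works: w_i lines up with dimension 1 of x_i_j_t
# x_i_j_t .+ v_t # doesn't work (DimensionMismatch): v_t has length T, but dimension 1 of x_i_j_t has length I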
using TransmuteDims
x_t_i_j = transmute(x_i_j_t, (3, 1, 2))
x_t_i_j .+ v_t # works
# x_i_j_t .+ h_i_t # doesn't work either: dimension 2 of h_i_t has length T, not J
x_i_t_j = transmute(x_i_j_t, (1, 3, 2))
x_i_t_j .+ h_i_t # works
βₓ, β_w, βᵥ, βₕ = .2, .3, .4, .5
# y_i_j_t = @. βₓ * x_i_j_t + β_w * w_i + βᵥ * v_t + βₕ * h_i_t # lol, no...
# Wouldn't it be great if I could represent w_i, v_t, and h_i_t as though they were all indexed by (i, j, t), so that this equation just worked?
# To evaluate y_i_j_t with only transmute and broadcasting, without an additional abstraction layer, I would need to call transmute several times, and the result would be difficult to read.
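For comparison, here is a sketch of how the commented-out equation can be evaluated with plain broadcasting, by reshaping each smaller array so that it has singleton dimensions wherever it lacks an index. The helper at_dims is a hypothetical name I made up, not part of Base or TransmuteDims. It only covers the case where the small array's dimensions already appear in the same order as in the big array:

```julia
I, J, T = 2, 3, 4
x_i_j_t = ones(Float64, I, J, T)
w_i   = collect(1:I)
h_i_t = reshape(collect(1:(I * T)), I, T)
v_t   = collect(1:T)

# Hypothetical helper: put dimension k of `a` at position dims[k] of an
# N-dimensional array and use size 1 everywhere else. reshape shares memory
# with `a`, so nothing is repeated.
function at_dims(a::AbstractArray, dims::NTuple{M,Int}, N::Int) where {M}
    ndims(a) == M || throw(ArgumentError("need one target dimension per dimension of `a`"))
    all(dims[k] < dims[k + 1] for k in 1:M-1) ||
        throw(ArgumentError("dims must be increasing; permute `a` first"))
    shape = ones(Int, N)
    for (k, d) in enumerate(dims)
        shape[d] = size(a, k)
    end
    return reshape(a, shape...)
end

v_3d = at_dims(v_t, (3,), 3)      # size (1, 1, T)
h_3d = at_dims(h_i_t, (1, 3), 3)  # size (I, 1, T)

βₓ, β_w, βᵥ, βₕ = 0.2, 0.3, 0.4, 0.5
# w_i needs no reshaping because it already lines up with dimension 1.
y_i_j_t = @. βₓ * x_i_j_t + β_w * w_i + βᵥ * v_3d + βₕ * h_3d
```

For a single case the helper is not even needed: reshape(v_t, 1, 1, :) gives the same (1, 1, T) array. This still leaves me tracking by hand which dimension means what, which is the bookkeeping I would like a package to handle.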
Basically I am hoping that there exists a package that provides a layer of abstraction so I can avoid the above mess. This is totally feasible; if it doesn’t exist I will write functions that execute the above in a more organized fashion, but I’d rather rely on existing packages to the extent that I can.
Does anyone know of existing packages that allow a multidimensional array to be represented as though it were of higher dimension?