Multilayer perceptron with time dependent multidimensional data using Flux

Hi, I have some problems coding a multilayer perceptron using Flux. My data is given in the form

D_t = (x_i, y_i)_{i=1}^n, where x_i is a scalar, y_i is a 3-dim vector, and t = 1, \ldots, T indexes the timesteps. One should notice that (x_1, y_1) in D_1 and (x_1, y_1) in D_2 correspond to each other, i.e. the second is the same data point at the next time step. For a single time step T=1, I constructed the data for my neural network as follows:

using Flux
n = 10
nsamples = n; in_features = 1; out_features = 3;
X = randn(Float32, in_features, nsamples);
Y = randn(Float32, out_features, nsamples);
data = Flux.Data.DataLoader((X,Y), batchsize=4); # automatic batching convenience
X1, Y1 = first(data); 
@show size(X1) size(Y1)
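For completeness, here is a minimal sketch of an MLP for this T=1 case. The hidden width of 16 and the use of `relu` are arbitrary choices of mine, not something fixed by the problem; newer Flux versions expose the loader as `Flux.DataLoader` rather than `Flux.Data.DataLoader`.

```julia
using Flux

nsamples = 10; in_features = 1; out_features = 3
X = randn(Float32, in_features, nsamples)
Y = randn(Float32, out_features, nsamples)
data = Flux.DataLoader((X, Y), batchsize=4)

# A small MLP mapping 1 input feature to 3 output features.
# Hidden size 16 is an arbitrary choice for illustration.
model = Chain(Dense(in_features => 16, relu), Dense(16 => out_features))

X1, Y1 = first(data)
Yhat = model(X1)          # size (out_features, batchsize) = (3, 4)
loss = Flux.mse(Yhat, Y1) # mean squared error on the first batch
```

From here a training loop would iterate over `data` and take gradient steps on `loss`.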

Now I want to extend the network to T > 1. My first idea was to reshape the data, but I could not really find a way to extract the results later on. My other idea was to increase the number of input and output features. Therefore I used

in_features = T

But for the output features I could not find a similar trick, since in my case that number represents the dimension of the output vector. So actually I would need something like
out_features = (T, 3)

But I am pretty sure that this does not work. Can someone suggest a better strategy for dealing with this kind of problem?


It’s hard from the mathematical description to discern what the actual desired data shapes are. Pseudocode and/or descriptions of each dimension would be far more helpful.

One thing that might help you: Flux’s Dense layer can take multiple dimensions between the input and batch. This is useful if you can apply the MLP over each timestep simultaneously.
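To illustrate that point, here is a sketch of `Dense` applied to a 3-d array whose middle dimension is time (the sizes here are made up). The layer acts only on the first dimension and leaves all trailing dimensions alone:

```julia
using Flux

in_features, out_features, T, batch = 2, 3, 20, 5

x = randn(Float32, in_features, T, batch)  # features x time x batch
layer = Dense(in_features => out_features)

y = layer(x)   # Dense maps the first dimension: size(y) == (3, 20, 5)
```

So the same weights are applied at every timestep and for every sample simultaneously.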

I see, let me start to give an example of one specific data point in the dataset.

Example: assuming two different input features, a scalar height and a scalar weight, my output is a 3-dim vector depending on time. One could write this data point (x_1, y_1(t)) as:

x_1 = (height_1, weight_1) and y_1(t) = (t, t^2, t^3) with t = 1, \ldots, T

Another way to represent the target would be

y_1 = [(1, 1, 1), (2, 4, 8),\ldots,(T, T^2, T^3)].

Of course, this is only one data point in a dataset of n samples. I wrote it in a more general form in Julia:

nsamples = 10
input_features = 2 # number of input features e.g. (height, weight)
output_features = 3 # dim of the vector e.g. (t, t^2, t^3)
T = 20 # timesteps

# One specific data point 
x1 = randn(Float32, input_features)
y1 = randn(Float32, T, output_features)

@show size(x1) size(y1)

# Data set i.e. input X and target Y contains nsamples of these data points
X = randn(Float32, input_features, nsamples)
Y = randn(Float32, T, output_features, nsamples)

@show size(X) size(Y)
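If the targets are instead stored with the feature dimension first, i.e. `Y = randn(Float32, output_features, T, nsamples)`, then the earlier hint about `Dense` accepting extra dimensions applies directly. A sketch of one way to make the input shapes line up (purely an assumption on my part: it repeats the static features across a time axis):

```julia
using Flux

nsamples, input_features, output_features, T = 10, 2, 3, 20

X = randn(Float32, input_features, nsamples)
Y = randn(Float32, output_features, T, nsamples)  # feature dim first, as Flux expects

# Repeat the static inputs along a new time axis so X matches Y's layout.
Xr = repeat(reshape(X, input_features, 1, nsamples), 1, T, 1)  # (2, 20, 10)

model = Chain(Dense(input_features => 16, relu), Dense(16 => output_features))
Yhat = model(Xr)  # size(Yhat) == size(Y) == (3, 20, 10)
```

Note that repeating time-invariant inputs makes the MLP produce the same output at every timestep; to get time-varying outputs one would also need to feed t (or some encoding of it) as an extra input feature.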

I’m not exactly sure I understand then. The input has no time dimension, but then the model adds one based on the height and weight? That’s definitely beyond what a plain MLP is capable of, so I assume you have a more complex model than originally described.