Hello, all! I need conceptual help with how to use a GRU that takes historic data for variables 1:8 and forecasted values for variables 1:4 in order to forecast variables 5:8.

I have time series data on renewable energy generation (solar and wind), day-ahead price (DAP), and single system price (SSP) in 30-minute intervals over about 6 weeks. I want to predict all four of these variables for the next h periods, i.e. T+1 through T+h. Additionally, I have brought in basic weather data, which I am treating as historic data up to time T and as forecasted values for T+1 through T+h.

$$E_{T+1:T+h} = m(W_{1:T},\ W_{T+1:T+h},\ E_{1:T})$$

- E_t: Energy variables at time t. (Solar, Wind, DAP, SSP)
- W_t: Weather variables at time t. (Temperature, Cloud Cover, Windspeed, Wind Direction)
- T: End of historic data
- h: forecast horizon
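
To make the notation concrete, here is a rough sketch of the four slices involved. It assumes the eight variables are stacked in an 8 × N matrix with weather in rows 1:4 and energy in rows 5:8. The names `split_window` and `X`, and the values chosen for `T` and `h`, are hypothetical and only for illustration.

```julia
# Hypothetical helper: split an 8×N matrix into the pieces used in the formula.
# Rows 1:4 are assumed to be weather (W), rows 5:8 energy (E).
function split_window(X::AbstractMatrix, T::Integer, h::Integer)
    @assert size(X, 1) == 8 "expected 8 variables (4 weather + 4 energy)"
    @assert T + h <= size(X, 2) "window runs past the end of the data"
    W_hist = X[1:4, 1:T]          # W_{1:T}
    W_fut  = X[1:4, T+1:T+h]      # W_{T+1:T+h} (forecasted weather)
    E_hist = X[5:8, 1:T]          # E_{1:T}
    E_fut  = X[5:8, T+1:T+h]      # E_{T+1:T+h} (the target)
    return (; W_hist, W_fut, E_hist, E_fut)
end

X = rand(Float32, 8, 500)         # fake data, 500 half-hour periods
win = split_window(X, 72, 48)     # 72 periods of history, 48 periods ahead
```

The model `m` would then take `win.W_hist`, `win.W_fut`, and `win.E_hist` as inputs and be trained to reproduce `win.E_fut`.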

```
using DataFrames, Dates

# Fake some data: one row per 30-minute interval over the last 45 days.
data = DataFrame(
    timestamp_utc = DateTime(today() - Day(45)):Minute(30):DateTime(today())
)
# weather variables (historical up to T, forecasted afterwards)
data.temperature = rand(Float32, size(data, 1))
data.cloudcover = rand(Float32, size(data, 1))
data.windspeed = rand(Float32, size(data, 1))
data.winddir = rand(Float32, size(data, 1))
# energy variables (historical only): these are the TARGETS
data.Solar = rand(Float32, size(data, 1))
data.Wind = rand(Float32, size(data, 1))
data.DAP = rand(Float32, size(data, 1))
data.SSP = rand(Float32, size(data, 1))
```

Then I create a GRU model from Flux with:

```
using Flux

# 8 input variables (4 weather + 4 energy) => 4 outputs (energy)
m = Chain(
    GRU(8 => 4)
)
```

When I call `m` on an $8 \times n$ `Matrix`, I get a $4 \times n$ `Matrix`. That suggests the model is meant to predict other variable(s) over the same n periods. But from the reading I've done, it seems I should be able to make predictions about the future, not just make "co-forecasts", if that term makes sense.

```
Flux.reset!(m)
m(data[1:72, 2:end] |> Matrix |> transpose)
```

Do I need to hardcode the forecast horizon? So if $h = 1$ day, should I add a `Dense` layer with 48 outputs (one per half-hour period, so $h = 48$) and then take only the first 4 columns? This does not seem like the most direct or correct approach, but it does give a useful shape, and I suppose the parameters would still be optimized?

```
m = Chain(
    GRU(8 => 4),
    Dense(4 => 48),
    x -> x[:, 1:4]
)
```
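
For comparison, one way to get all h steps in one shot would be to map a summary vector to 4h numbers and reshape them into a 4 × h matrix. The sketch below is plain Julia with random, untrained weights, so it only shows the shape bookkeeping. The names `nh`, `Wd`, `bd`, `h_T`, and `direct_head` are hypothetical, and `h_T` simply stands in for whatever hidden state a recurrent layer would produce at time T.

```julia
using Random

rng = MersenneTwister(0)
nh = 16          # hypothetical hidden-state size
horizon = 48     # h = 1 day of 30-minute periods
ntargets = 4     # Solar, Wind, DAP, SSP

# Stand-in for the recurrent layer's hidden state after reading data up to T.
h_T = rand(rng, Float32, nh)

# A single linear layer producing all ntargets * horizon values at once.
Wd = 0.1f0 .* randn(rng, Float32, ntargets * horizon, nh)
bd = zeros(Float32, ntargets * horizon)

# Julia reshapes column-major, so column k holds the 4 targets for step T+k.
direct_head(h) = reshape(Wd * h .+ bd, ntargets, horizon)

E_hat = direct_head(h_T)
size(E_hat)
```

In Flux terms this would roughly correspond to a `Dense(nh => 4 * horizon)` followed by a `reshape`, which does mean the horizon is fixed by the layer size.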

Also, even if the above is the case, I am still unclear on how to use both the historic and the forecasted data. Should I be working with two models, like:

- m(f(H), F), where F is the forecasted data and f(H) is the output of a model on the historic data H?
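
A minimal sketch of what the m(f(H), F) idea could look like, assuming an encoder/decoder split: `encode` plays the role of f and summarizes the 8 × T history into a hidden state, and `decode` plays the role of m and steps through the forecasted weather F one period at a time. This is hand-written plain Julia with random, untrained weights, not Flux's API, and every name in it (`init_gru`, `gru_step`, `encode`, `decode`, `Wout`, `bout`) is hypothetical. The update rule is one common form of the GRU equations.

```julia
using Random

sigmoid(x) = 1 / (1 + exp(-x))

# Randomly initialised parameters for one GRU cell (illustration only).
function init_gru(rng, nin, nh)
    W(m, n) = 0.1f0 .* randn(rng, Float32, m, n)
    return (Wr = W(nh, nin), Ur = W(nh, nh), br = zeros(Float32, nh),
            Wz = W(nh, nin), Uz = W(nh, nh), bz = zeros(Float32, nh),
            Wc = W(nh, nin), Uc = W(nh, nh), bc = zeros(Float32, nh))
end

# One GRU update: new hidden state from previous state h and input x.
function gru_step(p, h, x)
    r = sigmoid.(p.Wr * x .+ p.Ur * h .+ p.br)        # reset gate
    z = sigmoid.(p.Wz * x .+ p.Uz * h .+ p.bz)        # update gate
    c = tanh.(p.Wc * x .+ r .* (p.Uc * h) .+ p.bc)    # candidate state
    return (1 .- z) .* c .+ z .* h
end

# f(H): run over the historic columns and keep only the final state.
function encode(p, H)
    h = zeros(Float32, size(p.Ur, 1))
    for x in eachcol(H)
        h = gru_step(p, h, x)
    end
    return h
end

# m(f(H), F): start from the encoded state and consume forecasted weather.
function decode(p, Wout, bout, h, F)
    preds = Matrix{Float32}(undef, size(Wout, 1), size(F, 2))
    for (k, w) in enumerate(eachcol(F))
        h = gru_step(p, h, w)
        preds[:, k] = Wout * h .+ bout    # 4 energy values for step T+k
    end
    return preds
end

rng = MersenneTwister(1)
T, horizon, nh = 72, 48, 16
H = rand(rng, Float32, 8, T)          # historic weather + energy
F = rand(rng, Float32, 4, horizon)    # forecasted weather
enc = init_gru(rng, 8, nh)
dec = init_gru(rng, 4, nh)
Wout = 0.1f0 .* randn(rng, Float32, 4, nh)
bout = zeros(Float32, 4)

E_hat = decode(dec, Wout, bout, encode(enc, H), F)
size(E_hat)
```

With this layout the horizon is not baked into any layer size: the output has as many columns as F does.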

Ultimately, I plan to pass this model to `ConformalModels.jl` to obtain probabilistic forecasts. If anybody knows a reason that wouldn't work well, please let me know!

Thanks!