Training Neural ODEs - Advice

Hi,
I am looking for some advice about training neural ODE models. I want to train a neural ODE on a sample of ~25,000 matrices, each 10×4000. The columns represent time, so I was planning to train on an increasing number of columns of data, as in the weather forecasting example from the DiffEqFlux documentation, combined with minibatching.
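
Concretely, the growing-window loop I have in mind looks roughly like this (a minimal sketch on one synthetic trajectory; the network size, time span, and optimizer settings here are illustrative, not what I plan to use):

```julia
using Lux, DiffEqFlux, OrdinaryDiffEq, Optimization, OptimizationOptimisers
using ComponentArrays, Random, Zygote

rng = Random.default_rng()
data = randn(Float32, 10, 4000)                 # stand-in for one 10×4000 matrix
tsteps = range(0.0f0, 10.0f0; length = 4000)    # columns = time points

nn = Lux.Chain(Lux.Dense(10 => 64, tanh), Lux.Dense(64 => 10))
ps, st = Lux.setup(rng, nn)
ps = ComponentArray(ps)

# Fit on a growing number of columns, warm-starting each stage from the last.
for ncols in (100, 500, 1000, 4000)
    node = NeuralODE(nn, (tsteps[1], tsteps[ncols]), Tsit5();
        saveat = tsteps[1:ncols])
    optf = OptimizationFunction(
        (p, _) -> sum(abs2, Array(first(node(data[:, 1], p, st))) .- data[:, 1:ncols]),
        Optimization.AutoZygote())
    sol = solve(OptimizationProblem(optf, ps), OptimizationOptimisers.Adam(0.01);
        maxiters = 200)
    global ps = sol.u   # warm-start the next, longer window
end
```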

I am using Julia v1.11.5 with Lux and DiffEqFlux to define and train the models. I have access to 20 cores on an Intel i7-1370P, or an Intel Iris Xe GPU.

My questions are:

  • Given my hardware limitations, would the CPU or the GPU be the better option for training the model?
    • Given that GPU, would a compiled Lux model on the CPU be the better bet?
  • Can Lux use Julia threads to access multiple CPU cores?
    • I was thinking of an OpenMP-style setup: multiple threads and shared memory.
  • As a training strategy, does multiple shooting use fewer computational resources than training on an increasing amount of data?

Thanks

For a purely neural ODE it depends on the NN size, but the GPU is likely faster if your hidden layers are at least about 256 units wide.
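
A rough way to check the crossover on your own hardware is to time the dense-layer matmuls at a few widths. A sketch, assuming oneAPI.jl works with the Iris Xe on your driver stack (the batch size is illustrative):

```julia
using LinearAlgebra, BenchmarkTools
using oneAPI   # Intel GPU support

batch = 256
for width in (64, 256, 1024)
    W, x = randn(Float32, width, width), randn(Float32, width, batch)
    Wg, xg = oneArray(W), oneArray(x)
    t_cpu = @belapsed $W * $x
    # Copying back forces the GPU work to finish (and counts the transfer cost).
    t_gpu = @belapsed Array($Wg * $xg)
    println("width $width: CPU $(round(t_cpu * 1e6, digits = 1)) µs, ",
        "GPU $(round(t_gpu * 1e6, digits = 1)) µs")
end
```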

No need; BLAS will already multithread this by default.
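
For example, you can inspect or pin the BLAS thread count directly (the 20 below just mirrors the core count mentioned above):

```julia
using LinearAlgebra
BLAS.get_num_threads()      # inspect the current BLAS thread count
BLAS.set_num_threads(20)    # or cap it explicitly if desired
```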

The reason for multiple shooting is not efficiency but numerical stability: integrating a poorly fit model over the whole time span makes the loss hard to optimize, while fitting many short segments keeps it well-behaved.
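
To make the idea concrete, here is a hand-rolled sketch of a multiple-shooting loss. This is a simplified variant that seeds each segment from the observed data rather than learned initial states (DiffEqFlux also provides a `multiple_shoot` helper); all names and sizes below are illustrative:

```julia
using Lux, OrdinaryDiffEq, SciMLSensitivity, ComponentArrays, Random

rng = Random.default_rng()
data = randn(Float32, 10, 4000)
tsteps = range(0.0f0, 10.0f0; length = 4000)
group_size = 50

nn = Lux.Chain(Lux.Dense(10 => 64, tanh), Lux.Dense(64 => 10))
ps, st = Lux.setup(rng, nn)
ps = ComponentArray(ps)

dudt(u, p, t) = first(nn(u, p, st))   # out-of-place RHS from the Lux model

function shooting_loss(p)
    l = zero(eltype(data))
    # Solve many short segments, each started from the observed state,
    # instead of one long integration over the full 4000 columns.
    for start in 1:group_size:(size(data, 2) - group_size)
        idx = start:(start + group_size)
        prob = ODEProblem(dudt, data[:, first(idx)],
            (tsteps[first(idx)], tsteps[last(idx)]), p)
        pred = Array(solve(prob, Tsit5(); saveat = tsteps[idx]))
        l += sum(abs2, pred .- data[:, idx])
    end
    return l
end
```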

@ChrisRackauckas - thank you for the response. I appreciate you sharing your knowledge 😀.