I’m trying to understand the performance bottlenecks in training universal differential equations. I’m following this guide, and benchmarking the training shows that optimizing the parameters of the Lux network requires a large number of allocations:
@btime res1 = Optimization.solve(optprob, ADAM(), maxiters = 100)
  417.338 ms (2205218 allocations: 201.85 MiB)
I’m going to run a large number of similar optimizations and would like to reduce the time to solution for each one. My problems are low-dimensional, and I need to run the optimization for many given datasets.
Does anyone know how to minimize the number of allocations?
I’m thinking about using StaticArrays, as described here, but am unsure whether they work with Lux. Another thing I’d like to look into is SimpleChains, but I haven’t yet figured out how to make its interface compatible with Optimization.
Does anyone have experience with optimizing for this situation?