I was wondering if it would be possible to save the compiled GradientTape that is used in ReverseDiff.jl.
Using the gradient example from the package, I’d like to save compiled_f_tape:
using ReverseDiff: GradientTape, compile
# some objective function to work with
f(a, b) = sum(a' * b + a * b')
# pre-record a GradientTape for `f` using inputs of shape 100x100 with Float64 elements
const f_tape = GradientTape(f, (rand(100, 100), rand(100, 100)))
# compile `f_tape` into a more optimized representation
const compiled_f_tape = compile(f_tape)
But just because you can, doesn’t mean you should. The serialization format is not stable across Julia versions, so it should not be relied upon as a file format. There is also some overhead in reading from and writing to a file.
The better question to ask is: what are you trying to achieve? Why is it necessary to save the compiled tape? Why not just re-generate it in a new session? Julia is best suited to long-running sessions, not short one-off sessions.
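To illustrate the "long-running session" point, here is a minimal sketch that records and compiles the tape once and then reuses it for several fresh inputs in the same session. It uses the objective from the question with a smaller size (`n = 10`, chosen here only to keep the sketch quick); the loop and the printed summary are illustrative assumptions, not part of the package example.

```julia
using ReverseDiff: GradientTape, compile, gradient!

# Objective from the question.
f(a, b) = sum(a' * b + a * b')

# Record and compile once, using inputs with the shape the tape will see later.
n = 10  # smaller than the 100x100 example, just to keep this sketch fast
f_tape = GradientTape(f, (rand(n, n), rand(n, n)))
compiled_f_tape = compile(f_tape)

# Preallocate gradient storage, then reuse the same compiled tape for new inputs.
results = (zeros(n, n), zeros(n, n))
for trial in 1:3
    inputs = (rand(n, n), rand(n, n))
    gradient!(results, compiled_f_tape, inputs)
    println("trial ", trial, ": sum of |df/da| = ", sum(abs, results[1]))
end
```

Keep in mind that a compiled tape is only valid for inputs of the recorded shape, and only when the function's control flow does not depend on the input values.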
In my particular case, I was trying to optimize the problem (using the Optim.jl package with the LBFGS method) from different starting points. The tape and compile steps take about 10 hours for my particular problem. I thought saving the compiled tape beforehand might be a good idea so that I could re-use it, but I’m not sure if this is the best way to approach the problem. Would you mind sharing some advice on that?
I would take the time to make a minimal working example of what you’re trying to optimize. Don’t give the full 50,000 parameters, and don’t give the full data. Ideally, make something parameterized with random data, so that I can change a single input number and scale it to different sizes.
We’re more interested in the functional form. Are there any constraints?
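As a rough idea of what such a parameterized example could look like, here is a sketch. The least-squares objective, the helper name `make_problem`, and the sizes are all hypothetical stand-ins, since the thread does not show the real functional form. The tape is recorded and compiled once and then shared by every LBFGS run.

```julia
using Random
using ReverseDiff: GradientTape, compile, gradient!
using Optim

# Hypothetical stand-in problem: a least-squares fit with `n` parameters and
# `m` random observations. Change `n` and `m` to scale the problem.
function make_problem(n, m; seed = 1)
    rng = MersenneTwister(seed)
    A, y = randn(rng, m, n), randn(rng, m)
    function objective(x)
        r = A * x - y
        return sum(r .* r)
    end
    # Record and compile the tape once, for inputs of length `n`.
    tape = compile(GradientTape(objective, randn(rng, n)))
    grad!(G, x) = gradient!(G, tape, x)
    return objective, grad!
end

objective, grad! = make_problem(20, 50)

# Reuse the same compiled tape for every starting point.
for start in 1:3
    x0 = randn(MersenneTwister(start), 20)
    result = optimize(objective, grad!, x0, LBFGS())
    println("start ", start, ": minimum = ", Optim.minimum(result))
end
```

With something of this shape, timing the record and compile steps at a few values of `n` would show how the cost grows before committing to the full problem.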