Flux opt_state question

According to the documentation,

# Initialise the optimiser for this model:
opt_state = Flux.setup(rule, model)

for data in train_set
  # Unpack this element (for supervised training):
  input, label = data

  # Calculate the gradient of the objective
  # with respect to the parameters within the model:
  grads = Flux.gradient(model) do m
      result = m(input)
      loss(result, label)
  end

  # Update the parameters so as to reduce the objective,
  # according to the chosen optimisation rule:
  Flux.update!(opt_state, model, grads[1])
end

will compute the gradient of the loss function.

Does this mean that if I don’t set up opt_state, no computation graph is recorded, just as when
torch.no_grad() is applied in PyTorch, so there is no wasted computation when no gradient is required?


Zygote (the automatic differentiation engine underlying Flux) works differently from PyTorch’s autograd: it doesn’t keep a tape. If you want to compute gradients, you call Flux.gradient; if you don’t want gradients, you just call the function as usual.
The optimizer has nothing to do with it: the optimizer only handles how an already-computed gradient is used to update the model.
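
To make that concrete, here is a minimal sketch; the layer sizes, dummy data, Adam rule, and mse loss are illustrative placeholders, not taken from the example above:

using Flux

# A toy model and dummy data, purely for illustration.
model = Dense(3 => 2)
x = rand(Float32, 3)
y = rand(Float32, 2)

# Plain forward pass: nothing is tracked, so there is no torch.no_grad()
# equivalent to apply and no extra gradient-related work happens.
ŷ = model(x)

# Gradients are computed only when explicitly requested:
grads = Flux.gradient(m -> Flux.mse(m(x), y), model)

# opt_state comes into play only when applying those gradients:
opt_state = Flux.setup(Adam(), model)
Flux.update!(opt_state, model, grads[1])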


Thanks for the clarification!