Lifting a Julia function into a Flux "layer"

Ok, so the docs have this example:

opt = SGD([W, b], 0.1) # Gradient descent with learning rate 0.1

opt() # Carry out the update, modifying `W` and `b`.

So let’s try it for the case above:

using Flux
using Flux.Tracker
using Flux.Optimise

W = param(rand(2, 5))
b = param(rand(2))

predict(x) = W*x .+ b
loss(x, y) = sum((predict(x) .- y).^2)

x, y = rand(5), rand(2) # Dummy data

par = Params([W, b])
grads = Tracker.gradient(() -> loss(x, y), par)

opt = SGD(par, 0.1)
opt()

This fails with

MethodError: no method matching length(::Params)
Closest candidates are:
  length(!Matched::Core.SimpleVector) at essentials.jl:576
  length(!Matched::Base.MethodList) at reflection.jl:728
  length(!Matched::Core.MethodTable) at reflection.jl:802
  ...

But the call doesn’t quite make sense anyway: it only references the parameters and never mentions the loss function (or the gradients just computed).

It seems that for a proper Flux model, [W, b] or params(m) must somehow carry the loss function under the hood. But it’s not clear how to make that connection for a hand-rolled loss function.
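
For comparison, here is the workflow I would have expected to work, going by the docs example above: pass a plain array of tracked params to SGD, run the forward pass, call back! on the loss, and then call opt(). (This is just my reading of the old Tracker-style API, so I may well be missing something.)

opt = SGD([W, b], 0.1)   # plain array of tracked params, as in the docs snippet

l = loss(x, y)           # forward pass builds the tracked graph
Tracker.back!(l)         # backprop fills in W.grad and b.grad
opt()                    # applies the SGD update from the accumulated grads

But even if that runs, it still leaves the original question: how are Params and Tracker.gradient meant to be hooked up to the optimiser for a hand-rolled loss like this?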