I have the following custom layer.
struct Activation{F}
    f::F
    Activation(f::Function) = new{typeof(f)}(f)
end
(m::Activation)(x::AbstractArray) = m.f.(x)
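For reference, a minimal sketch of constructing and calling it (the parametric field means f ends up with a concrete type such as typeof(tanh)):

a = Activation(tanh)      # concrete type Activation{typeof(tanh)}
typeof(a.f)               # typeof(tanh), so the field is concretely typed
a(ones(Float32, 2, 2))    # applies tanh elementwise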
I test whether it allows a model to be trained using the following code:
using Flux
function test_training(model, x, y)
    opt = Descent(0.1)
    loss = Flux.Losses.mse
    losses = Vector{Float32}(undef, 2)
    for i = 1:2
        local loss_val
        ps = Flux.Params(Flux.params(model))
        gs = gradient(ps) do
            predicted = model(x)
            loss_val = loss(predicted, y)
        end
        losses[i] = loss_val
        Flux.Optimise.update!(opt, ps, gs)
    end
    # two identical losses are taken to mean the parameters did not update
    if losses[1] == losses[2]
        error("Parameters not updating.")
    end
    return nothing
end
x = ones(Float32, 4, 4, 1, 1)
y = ones(Float32, 4, 4, 2, 1)
model = Chain(Conv((3, 3), 1 => 2, pad=SamePad()), Activation(tanh))
test_training(model, x, y)
The model does get trained on my machine (Windows), but for some reason it fails on all operating systems when tested in GitHub Actions.
The following type-unstable variant of the custom layer does work in GitHub Actions:
struct Activation
    f::Function
end
(m::Activation)(x::AbstractArray) = m.f.(x)
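As far as I can tell (this is my assumption about where the type instability comes from), the only difference is that the field here has the abstract type Function, which can be checked with:

isconcretetype(fieldtype(Activation, :f))                  # false for this variant (field type is Function)
# isconcretetype(fieldtype(Activation{typeof(tanh)}, :f))  # true for the parametric version above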
I checked the Julia and Flux versions on GitHub Actions, and they are the same as on my machine.
Does anyone have any ideas about what is happening here? A comment on whether it works on your machine or not would also help.
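In case it helps with debugging, here is a rough sketch (my own suggestion; before/after are just illustrative names) of checking directly whether a single update moves the Conv weights:

using Flux
before = deepcopy(first(Flux.params(model)))            # snapshot of the first parameter array
ps = Flux.params(model)
gs = gradient(() -> Flux.Losses.mse(model(x), y), ps)   # implicit-parameter gradient
Flux.Optimise.update!(Descent(0.1), ps, gs)
maximum(abs.(first(Flux.params(model)) .- before))      # 0.0 would mean the weights really did not move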
EDIT:
Changing opt = Descent(0.1) to opt = ADAM() allows the checks to pass. However, I still do not know why it fails when using Descent.