Flux custom layers - incorrect solution from training

My real goal is to include layers whose weight matrices are computed from known physics solutions; training will then determine the physical parameters that appear in the equations used to compute the matrix elements.

To get started, I created a toy problem with a 2x2 matrix. The function that generates the matrix entries is:

function w(x::AbstractArray,y::Number)
  [ 3x[1]*y 2x[1]*x[1] ; -1.25y*y 2.5*x[1]*x[1]*y]
end

In the toy example I only want to estimate the value of x, so it is set up as a 1-element vector, while y is left as a plain Number.
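For concreteness, the layer defined in the complete code below is constructed and applied roughly like this (a minimal usage sketch, using the starting guess as the example value):

layer = mylayer([1.2], 2.0)   # x as a 1-element vector, y as a plain Number
layer([1.0, 1.0])             # multiplies the input by the 2x2 matrix w(x, y)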

I compute a “true solution” (the training data) using x=[1.0], y=2.0, and initialize a single-layer model with the guess x_guess=[1.2]; note that the input to the network is the constant vector [1.;1.].
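For reference, evaluating the generating function at the true parameters gives the training target (a quick check using the w defined above):

w([1.0], 2.0)                # 2x2 matrix [6.0 2.0; -5.0 5.0]
w([1.0], 2.0) * [1.0; 1.0]   # training target y_true = [8.0, 0.0]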

The model runs and the loss function works correctly, but training always converges to the same wrong answer, about x=1.38.

The complete working example code is below, including some plotting lines that show the error function has its minimum at x=1, as expected.

Does anyone have ideas about why the network “training” does not get the right answer (i.e., can anyone spot the bug(s), sadly)? I’ve been staring at this but must be missing something: the error should reach zero at x=1, yet training converges to x=1.38 with an error of 18.904071388502857.

using Flux

# trivial custom layer holding the parameters x and y
struct mylayer{X <: AbstractArray, Y <: Number}
  x::X
  y::Y
end
# function that generates the weight matrix from x and y
function w(x::AbstractArray,y::Number)
  [ 3x[1]*y 2x[1]*x[1] ; -1.25y*y 2.5*x[1]*x[1]*y]
end
function (a::mylayer)(vec::AbstractArray)
  x,y = a.x, a.y
  wmat = w(x,y)
  wmat*vec
end
# jazz it up with @functor so Flux treats the fields as parameters
Flux.@functor mylayer

# true target generated with x=[1.0], y=2.0 - the network “input” is the constant [1.;1.]
y_true = w([1.0],2.0)*[1.0;1.0]

# now set up the loss, optimiser, and model
function loss(xx,yy)
  ŷ = m([1.0,1.0])
  Flux.mse(ŷ,yy)
end
opt = Descent(0.0005)  # need to damp this or it blows up
# initialize model with [x]=1.2, y=2.0
x_guess = [1.2]
m = Chain(mylayer(x_guess,2.0))
# now train:
Flux.@epochs 15 Flux.train!(loss, params(m), zip(x_guess, y_true), opt,
  cb = () -> println("loss=$(loss([1.], y_true))  x=$(params(m))"))



# plot the error function to confirm it has its minimum at [x]=[1.0]
using Plots
xvals = collect(0:0.1:2.0)
err = [Flux.mse(w([x],2)*[1.;1.], y_true) for x in xvals]
Plots.plot(xvals,err)
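As a companion to the plot, a direct point check of the same error (a quick sanity check using the w and y_true defined above) shows it is exactly zero at the true value and nonzero near the converged value:

Flux.mse(w([1.0], 2.0) * [1.0; 1.0], y_true)    # 0.0 at the true value x=[1.0]
Flux.mse(w([1.38], 2.0) * [1.0; 1.0], y_true)   # nonzero at the approximate converged value x≈1.38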