# FluxML Basic Custom Layer with Custom Loss Function

Hi,

I want to write a custom layer and a custom loss function, so I thought I'd start with the basic linear regression problem from the Flux tutorial and do the SGD from scratch for a custom layer in a Chain.
For the "custom" layer, I copied the source code of Dense from Flux and simply renamed it Linear.

The problem is that the loss goes towards infinity and I don’t really know why.
Any help is greatly appreciated.

Thank you in advance to anybody who is taking the time to read this.

Below is the super basic code.

```julia
clearconsole()
println("LinearRegression.jl")

using Flux, Flux.Tracker
using Distributions
using Plots

num_samples = 50
x_noise_std = 0.01
y_noise_std = 0.5

function generate_linear_data()
    x = reshape(range(-1, stop=1, length=num_samples), num_samples, 1)
    x_noise = rand(Normal(0, x_noise_std), num_samples)
    y_noise = rand(Normal(0, y_noise_std), num_samples)

    y = 1 .* x .+ 3 .+ y_noise

    if false
        display(scatter(x, y))
        error("Exited in function generate_linear_data()")
    end

    x = reshape(x, 1, num_samples)
    y = reshape(y, 1, num_samples)

    return x, y
end

X, Y = generate_linear_data() # Training data of shape (1,50)

# Copied code from Dense layer and simply renamed it
struct Linear{F,S,T}
    W::S
    b::T
    σ::F
end

Linear(W, b) = Linear(W, b, identity)

function Linear(in::Integer, out::Integer, σ = identity)
    return Linear(param(randn(out, in)), param(zeros(out)), σ)
end

Flux.@treelike Linear

function (a::Linear)(x::AbstractArray)
    W, b, σ = a.W, a.b, a.σ
    σ.(W*x .+ b)
end

layer = Linear(1, 1)
# layer = Flux.Dense(1,1)

criterion(x, y) = mean((model(x) .- y).^2)
model = Chain(layer)
θ = Flux.params(model)
opt = Flux.Descent(0.01)

for itr = 1:100
    pred = layer(X) # Full batch training with size(X)=(1,50)
    loss = criterion(pred, Y)
    println(loss)
    for p in θ
        # println("p ", p)
    end
end

ŷ = Tracker.data(model(X))
scatter([transpose(X) transpose(X)], [transpose(Y) transpose(ŷ)], layout=(2,1))
```

I haven’t run your code yet (I will when back in front of my laptop), but anytime your loss goes to infinity, you should ask yourself if you’re missing a negative sign in gradient updates. Try `update!(opt, p, -grads[p])`. This also changed recently on the latest Flux, which might explain what’s going on.
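For concreteness, here is a minimal sketch of such a manual update loop under the old Tracker API. It assumes `model`, `criterion`, `θ`, `opt`, `X`, and `Y` are defined as in the post above; the exact `update!` signature depends on your Flux version, so treat this as illustrative rather than definitive:

```julia
using Flux, Flux.Tracker

for itr in 1:100
    # Compute gradients of the loss with respect to all tracked parameters
    grads = Tracker.gradient(() -> criterion(X, Y), θ)
    for p in θ
        # Negate the gradient by hand to compensate for the minus sign
        # that is missing in this Flux version's update!
        update!(opt, p, -grads[p])
    end
end
```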


In the current implementation of `update!`, a minus sign is missing. So you need to put it in by hand, as @jpsamaroo suggested.


I found my mistake:
my loss definition `criterion(x, y) = mean((model(x) .- y).^2)` was the culprit.
Since I passed `pred = layer(X)` into it, I did two forward passes through the linear layer while computing the loss.
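The difference, side by side, using the definitions from the original post (`model`, `pred`, `X`, and `Y` as defined there):

```julia
# Buggy: `pred` is already model(X), so the loss evaluates model(model(X))
criterion(x, y) = mean((model(x) .- y).^2)
loss = criterion(pred, Y)       # double forward pass

# Fixed: the criterion operates on the predictions directly
criterion(x, y) = mean((x .- y).^2)
loss = criterion(model(X), Y)   # single forward pass
```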

Below is the working code:

```julia
clearconsole()
println("LinearRegression.jl")

using Flux, Flux.Tracker
using Distributions
using Plots

num_samples = 50
x_noise_std = 0.01
y_noise_std = 0.25

function generate_linear_data()
    x = reshape(range(-1, stop=1, length=num_samples), num_samples, 1)
    x_noise = rand(Normal(0, x_noise_std), num_samples)
    y_noise = rand(Normal(0, y_noise_std), num_samples)

    y = 3 .* x .+ y_noise #.+ 3

    if false
        display(scatter(x, y))
        error("Exited in function generate_linear_data()")
    end

    x = transpose(x)
    y = transpose(y)

    return x, y
end

X, Y = generate_linear_data() # Training data of shape (1,50)

# Copied code from Dense layer and simply renamed it
struct Linear{F,S,T}
    W::S
    b::T
    σ::F
end

Linear(W, b) = Linear(W, b, identity)

function Linear(in::Integer, out::Integer, σ = identity)
    return Linear(param(randn(out, in)), param(randn(out)), σ)
end

Flux.@treelike Linear

function (a::Linear)(x::AbstractArray)
    W, b, σ = a.W, a.b, a.σ
    return (W*x .+ b)
end

layer = Linear(1, 1)
# layer = Flux.Dense(1,1)

model = Chain(layer)
criterion(x, y) = mean((x .- y).^2)
θ = Flux.params(model)
opt = Flux.Descent(-0.1)
println("θ ", θ)
for itr = 1:300
    pred = model(X) # Full batch training with size(X)=(1,50)
    loss = criterion(pred, Y)
    # println(loss)
    for p in θ
        # println("p ", p)