Flux: Custom Layer

In Flux, I need to create a custom layer (named Nonneg) and then train the model with Flux.train!. I tried replicating the Dense layer, following @LudiWin’s example and the definition of the Dense layer in the Flux source.

When I run this, I get a “no method matching” error originating in Flux.train!; the closest candidate listed is my function (a::Nonneg)(x::AbstractArray):

MethodError: no method matching (::Nonneg{typeof(identity),Array{Float64,2},Array{Float64,1}})(::Float64)
Closest candidates are:
  Any(!Matched::AbstractArray) at In[9]:44

Stacktrace:
 [1] macro expansion at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0 [inlined]
 [2] _pullback(::Zygote.Context, ::Nonneg{typeof(identity),Array{Float64,2},Array{Float64,1}}, ::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:7
 [3] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [4] _pullback(::Zygote.Context, ::typeof(Flux.applychain), ::Tuple{Nonneg{typeof(identity),Array{Float64,2},Array{Float64,1}}}, ::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [5] Chain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:38 [inlined]
 [6] _pullback(::Zygote.Context, ::Chain{Tuple{Nonneg{typeof(identity),Array{Float64,2},Array{Float64,1}}}}, ::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [7] loss at .\In[9]:63 [inlined]
 [8] _pullback(::Zygote.Context, ::typeof(loss), ::Float64, ::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [9] adjoint at C:\Users\username\.julia\packages\Zygote\YeCEW\src\lib\lib.jl:179 [inlined]
 [10] _pullback at C:\Users\username\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:47 [inlined]
 [11] #17 at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:89 [inlined]
 [12] _pullback(::Zygote.Context, ::Flux.Optimise.var"#17#25"{typeof(loss),Tuple{Float64,Float64}}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [13] pullback(::Function, ::Zygote.Params) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:174
 [14] gradient(::Function, ::Zygote.Params) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:54
 [15] macro expansion at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:88 [inlined]
 [16] macro expansion at C:\Users\username\.julia\packages\Juno\f8hj2\src\progress.jl:134 [inlined]
 [17] train!(::typeof(loss), ::Zygote.Params, ::Array{Tuple{Float64,Float64},2}, ::ADAM; cb::typeof(evalcb)) at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:81
 [18] top-level scope at .\In[9]:66

What am I doing wrong? The code is below.

using Plots
using Distributions
using Flux
using Flux: mse, @treelike

num_samples = 50
x_noise_std = 0.01
y_noise_std = 0.1

function generate_data()
    x = reshape(range(0, stop=4π, length=num_samples), num_samples, 1)
    y_noise = rand(Normal(0,y_noise_std), num_samples)
    
    y = sin.(x).^2 + y_noise
    
    x = transpose(x)
    y = transpose(y)
    
    return x, y
end

X, Y = generate_data() # Training data of shape (1,50)

struct Nonneg{F,S<:AbstractArray,T<:AbstractArray}
    W::S
    b::T
    σ::F
end

Nonneg(W, b) = Nonneg(W, b, identity)

function Nonneg(in::Integer, out::Integer, σ=identity) # tanh
    return Nonneg(randn(out, in), randn(out), σ)
end

Flux.@functor Nonneg  # makes trainable

function (a::Nonneg)(x::AbstractArray)
    # Later:
    # offset = min(0, minimum(x[:]))
    # a.σ(a.W * (x .- offset) .+ a.b) 
    a.σ(a.W * x .+ a.b)
end

# @treelike Nonneg # some say to use @treelike, but it's not used in the Flux definition of Dense

layer = Nonneg(1, 1) # compare to Dense(1, 1)

LossLog = []
LossLog_T = []
function evalcb()
    loss_value = loss(X, Y)
    push!(LossLog,loss_value)
    push!(LossLog_T,length(LossLog))
    @show([length(LossLog), loss_value])
end

m = Chain(layer) # later: Chain(Dense(1, 10), Dense(10,1), layer)
opt = ADAM()
dataset = [z for z in zip(X, Y)]
loss(x, y) = mse(m(x), y)

for idx = 1 : 100
    Flux.train!(loss, Flux.params(m), dataset, opt; cb=evalcb)
end

scatter([transpose(X) transpose(X)], [transpose(Y) transpose(m(X))], layout=(1,1))
println(loss(X, Y))

I’ve even tried replacing layer = Nonneg(1, 1) with layer = Dense(1, 1), but that leads to a similar error.

MethodError: no method matching (::Dense{typeof(identity),Array{Float32,2},Array{Float32,1}})(::Float64)
Closest candidates are:
  Any(!Matched::AbstractArray{T,N} where N) where {T<:Union{Float32, Float64}, W<:(AbstractArray{T,N} where N)} at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:133
  Any(!Matched::AbstractArray{#s107,N} where N where #s107<:AbstractFloat) where {T<:Union{Float32, Float64}, W<:(AbstractArray{T,N} where N)} at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:136
  Any(!Matched::AbstractArray) at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:121

Your new layer and the Flux Dense layer are both defined to operate on ::AbstractArray; however, your data generation and batching produce tuples of ::Float64, not tuples of ::AbstractArray. For this example, you can wrap each sample in a one-element array:

dataset = [([a], [b]) for (a, b) in zip(X, Y)]
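
To see the difference, note that iterating a Matrix yields scalars, so zip over two matrices produces tuples of numbers. A quick toy check (my own example, not from your code):

X = reshape(collect(1.0:3.0), 1, 3)        # 1×3 Matrix{Float64}, like the training data
Y = reshape(collect(4.0:6.0), 1, 3)

first(zip(X, Y))                           # (1.0, 4.0), a Tuple{Float64,Float64}
first(([a], [b]) for (a, b) in zip(X, Y))  # ([1.0], [4.0]), arrays, as the layer expects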

You should also check out Flux.DataLoader.
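
For example, a minimal sketch, assuming a Flux version whose Flux.Data.DataLoader accepts the arrays positionally (newer releases take them as a tuple instead, DataLoader((X, Y); ...)):

using Flux.Data: DataLoader

# X and Y are 1×50; DataLoader batches along the last dimension
# and yields (x_batch, y_batch) tuples of arrays.
dataset = DataLoader(X, Y, batchsize=10, shuffle=true)

Flux.train!(loss, Flux.params(m), dataset, opt; cb=evalcb)

This avoids building the array of tuples by hand and gives you minibatching for free.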


@contradict – Thank you! That makes sense. When I made the change you recommended, it worked perfectly! I’ll look into Flux.DataLoader, as you suggested.

I’m working on further customizing the layer to apply different activation functions to different outputs; it may be of interest to others like me who are new to this, so I’ll add it to this thread – along with a question or two – hopefully tomorrow.
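
In the meantime, here is a rough sketch of the direction I have in mind; the name MultiAct and the row-wise scheme are just placeholders for now. The idea is to keep a tuple of activation functions and apply the i-th one to the i-th output row:

struct MultiAct{F<:Tuple,S<:AbstractArray,T<:AbstractArray}
    W::S
    b::T
    σs::F   # one activation function per output row
end

MultiAct(in::Integer, σs...) = MultiAct(randn(length(σs), in), randn(length(σs)), σs)

Flux.@functor MultiAct

function (a::MultiAct)(x::AbstractArray)
    z = a.W * x .+ a.b
    # apply σs[i] elementwise to row i, then stack the rows back together
    rows = [a.σs[i].(z[i:i, :]) for i in 1:length(a.σs)]
    return vcat(rows...)
end

layer = MultiAct(10, softplus, identity)  # first output non-negative, second unconstrained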

For completeness, here is the working code. I added a switch, use_nonneg, to choose between the custom layer (with a working non-negativity constraint) and a standard Dense layer. I also cleaned up the output. Thanks again for the help!

using Plots
using Distributions
using Flux
using Flux: mse

#
##### GENERATE DATA #########
#
num_samples = 50
x_noise_std = 0.01
y_noise_std = 0.1

function generate_data()
    x = reshape(range(-π/2, stop=π/2, length=num_samples), num_samples, 1)
    y_noise = rand(Normal(0,y_noise_std), num_samples)
    y = sin.(x).^2 .- 0.25 .+ y_noise
    
    return x', y'
end

X, Y = generate_data() # Training data of shape (1,50)

#
##### CUSTOM LAYER #########
#
struct Nonneg{F,S<:AbstractArray,T<:AbstractArray}
    W::S
    b::T
    σ::F
end

Nonneg(W, b) = Nonneg(W, b, identity)

# Default activation function softplus keeps output non-negative without depressing fits to peaks
function Nonneg(in::Integer, out::Integer, σ=softplus) 
    return Nonneg(randn(out, in), randn(out), σ)
end

Flux.@functor Nonneg  # makes trainable

function (a::Nonneg)(x::AbstractArray)
    a.σ.(a.W * x .+ a.b)
end

#
##### CALLBACK & PLOTS #########
#
LossLog = []
LossLog_T = []
function evalcb()
    loss_value = loss(X, Y)
    push!(LossLog,loss_value)
    push!(LossLog_T,length(LossLog))
    if mod(length(LossLog),500)==1
        update_loss_plot()
    end
end
    
function update_loss_plot()
    p_loss = plot(LossLog_T, LossLog, ylabel="Loss", xlabel="Index", yscale=:log10, legend=false)
    IJulia.clear_output(true) # assumes an IJulia (Jupyter) notebook session
    display(p_loss)
    return p_loss
end

function plot_with_fit(x, y, yfit)
    return plot([x x], [y yfit]; color=[:black :red], lw=[0 2], marker=[:circle :none], label=["Data" "Fit"], legend=:top, ylabel="Data & Fit")
end

#
##### MODEL / TRAINING ###############
#
use_nonneg = true # use custom (non-negativity) layer or Dense?

n = 10 # neurons in hidden layers
layer = use_nonneg ? Nonneg(n, 1) : Dense(n, 1)

m = Chain(Dense(1, n, tanh), Dense(n, n, tanh), layer) # simplest version: Chain(layer)

opt = ADAM()
dataset = [([a], [b]) for (a,b) in zip(X, Y)]
loss(x, y) = mse(m(x), y)

for idx = 1 : 100
    Flux.train!(loss, Flux.params(m), dataset, opt; cb=evalcb)
end
p_loss = update_loss_plot() #final update
p_fit = plot_with_fit(X', Y', m(X)')
IJulia.clear_output(true)
plot(p_loss, p_fit,layout=(2,1))