I am constructing a custom layer and am getting an error:
ERROR: MethodError: no method matching iterate(::Type{Vector{Float32}})
Closest candidates are:
iterate(::Union{LinRange, StepRangeLen}) at range.jl:872
iterate(::Union{LinRange, StepRangeLen}, ::Integer) at range.jl:872
iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}} at dict.jl:712
...
Stacktrace:
[1] Polylayer(coefs::Vector{Float32})
@ Main ~/src/2022/basic_UODE/custom_lux_layer/poly_layer.jl:13
[2] top-level scope
@ ~/src/2022/basic_UODE/custom_lux_layer/poly_layer.jl:17
The error should be easy to decipher, but I can’t figure it out. I created an MWE:
using Lux

struct Polylayer{C} <: Lux.AbstractExplicitLayer
    coefs::C
end

function Polylayer(coefs)
    dtype = (typeof(coefs))
    return Polylayer{dtype...}(coefs)
end

a = Vector{Float32}(undef, 4)
pol = Polylayer(a)
Yes, I know there are functions missing. But since I am only testing the layer by calling the functions manually, I did not think the extra functions were necessary. I also created a show function, thinking that would do the trick. Finally, I removed other functionality to create an MWE.
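Update: the same error reproduces without Lux, which suggests the problem is the splat. `(typeof(coefs))` is not a tuple (the parentheses are redundant), so `dtype...` tries to iterate a bare `Type`. A minimal Lux-free sketch (`Wrapper` is just a stand-in struct):

```julia
struct Wrapper{C}
    x::C
end

v = Float32[1, 2, 3]

dtype = (typeof(v))        # parentheses alone: still a bare Type, not a tuple
# Wrapper{dtype...}(v)     # MethodError: no method matching iterate(::Type{Vector{Float32}})

dtype1 = (typeof(v),)      # the trailing comma makes a 1-tuple, which can be splatted
w1 = Wrapper{dtype1...}(v)

w2 = Wrapper{typeof(v)}(v) # or skip the tuple and splat entirely
println(typeof(w1) === typeof(w2))  # true
```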
I want to construct a custom layer that implements a polynomial of specified degree for a pair of variables. So I created the structure:
struct Polylayer{C} <: Lux.AbstractExplicitLayer
    in_dims::Int
    out_dims::Int
    degree::Int
    coefs::C
end
and the following constructors:
function Polylayer(in_dims::Int, out_dims::Int, coefs)
    degree = length(coefs)
    dtype = (typeof(coefs))
    return Polylayer{dtype...}(in_dims, out_dims, degree, coefs)
end

function Polylayer(in_dims::Int, out_dims::Int, degree::Int)
    coefs = zeros(degree)
    dtype = (typeof(coefs))
    #println(coefs, typeof(coefs))
    return Polylayer{dtype...}(in_dims, out_dims, degree, coefs)
end
I am modeling this approach on the Dense layer found in the Lux.jl source code. I looked for examples of custom layers for Lux.jl online and there are very few of them at this time.
I know I must implement initialparameters, parameterlength and statelength.
The layer will receive one scalar variable and output the evaluation of the polynomial; the layer is called as Polylayer(x, p, st). My objective is to have Lux train the polynomial coefficients. The larger experiment is a run of the Lotka–Volterra (LV) equations in which I express the nonlinear term directly as a polynomial and use a UODE to find the coefficients. This is a test case for a future, larger problem.
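A sketch (not my final code) of how such a call could evaluate the polynomial elementwise, assuming the coefficients are stored constant-term first in `ps.coeffs`; the `Polylayer` here is a bare stand-in struct, not the Lux layer:

```julia
struct Polylayer
    degree::Int
end

# Hypothetical functor: y = c1 + c2*x + c3*x^2 + ... applied elementwise.
# Base.evalpoly evaluates the polynomial via Horner's rule.
function (l::Polylayer)(x::AbstractArray, ps, st::NamedTuple)
    y = evalpoly.(x, Ref(Tuple(ps.coeffs)))
    return y, st
end

ps = (coeffs = [1.0, 2.0, 3.0],)   # 1 + 2x + 3x^2
y, st = Polylayer(2)([0.0 1.0], ps, NamedTuple())
println(y)  # [1.0 6.0]
```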
I implemented a more complete layer, and it also generates errors:
struct Polylayer{C,F} <: Lux.AbstractExplicitLayer
    in_dims::Int
    out_dims::Int
    degree::Int
    coefs::C
    init_weight::F
end

function Base.show(io::IO, d::Polylayer)
    println(io, "Polylayer($(d.in_dims) => $(d.out_dims), degree: $(d.degree))")
    println("inside show")
end

function Polylayer(in_dims::Int, out_dims::Int, coefs, init_weight=Lux.rand32)
    degree = length(coefs)
    dtype = (typeof(coefs), typeof(init_weight))
    return Polylayer{dtype...}(in_dims, out_dims, degree, coefs, init_weight)
end

function initialparameters(rng::AbstractRNG, d::Polylayer)
    return (d.init_weight(rng, d.degree+1))
end

statelength(d::Polylayer) = 0

function parameterlength(d::Polylayer)
    return d.degree + 1
end
I just did not think that the additional two functions were necessary, even for minimal testing.
Rather, I thought the additional functions would be necessary when actually using the layer within a chain.
OK, I now understand. There is no need to include the weight arrays in the struct; rather, the parameters are initialized when I call Lux.setup.
I must say that I still have difficulty wrapping my head around the Julia way of coding. Yes, Julia is extremely efficient, but the approach to coding takes lots of getting used to. I am not convinced it is faster than conventional languages. So far, the development process (for me) is slower. But of course, that is probably due to the many years of coding in Fortran, C++, Python.
Here is my next-to-final custom layer. All seems to work except the very last Lux.setup(...) call. Note that if I call initialparameters() directly, the function returns an initialized parameter array. Strange. Here is the code:
using Lux
using Random

"""
Apply a polynomial to an input Float.
The coefficients are trainable.
First attempt: polynomial of a specified `degree`.
"""
struct Polylayer{F} <: Lux.AbstractExplicitLayer
    in_dims::Int
    out_dims::Int
    degree::Int
    init_weight::F
end

function Base.show(io::IO, d::Polylayer)
    println(io, "Polylayer($(d.in_dims) => $(d.out_dims), degree: $(d.degree))")
    println("inside show")
end

function Polylayer(in_dims::Int, out_dims::Int, degree::Int, init_weight=Lux.rand32)
    dtypes = (typeof(init_weight))
    return Polylayer{dtypes}(in_dims, out_dims, degree, init_weight)
end

function initialparameters(rng::AbstractRNG, d::Polylayer)
    return (d.init_weight(rng, d.degree+1))
end

statelength(d::Polylayer) = 0

function parameterlength(d::Polylayer)
    return d.degree + 1
end
# ====================================
rng = Random.default_rng()
model = Polylayer(5, 3, 3)
ps, st = Lux.setup(rng, model)
ps # I get an empty parameter list
Very true, @ToucheSir! Intellectually I know how it works, but out of habit I return to the standard way of doing things. I tried learning JAX recently but have not put in the time. I am a PyTorch person myself.
function initialparameters(rng::AbstractRNG, d::Polylayer)
    return (coeffs = d.init_weight(rng, d.degree+1),)
end

Lux.initialstates(::AbstractRNG, ::Polylayer) = NamedTuple()

function (l::Polylayer)(x::AbstractMatrix, ps, st::NamedTuple)
    y = ps.coeffs .* x
    ... # do whatever your layer is
end
Thanks, but I still get empty arrays from Lux.setup.
I noticed you added a comma in initialparameters, but it had no effect. There must be a tiny mistake somewhere, but I have no idea where. Julia is rather unforgiving, as is any compiled language.
To make sure we are looking at the same code, here it is again:
using Lux
using Random

"""
Apply a polynomial to an input Float.
The coefficients are trainable.
First attempt: polynomial of a specified `degree`.
"""
struct Polylayer{F} <: Lux.AbstractExplicitLayer
    out_dims::Int
    degree::Int
    init_weight::F
end

function Base.show(io::IO, d::Polylayer)
    println(io, "Polylayer(out_dims: $(d.out_dims), degree: $(d.degree))")
end

function Polylayer(out_dims::Int, degree::Int, init_weight=Lux.rand32)
    dtypes = (typeof(init_weight))
    return Polylayer{dtypes}(out_dims, degree, init_weight)
end

function (l::Polylayer)(x::AbstractMatrix, ps, st::NamedTuple)
    y = x
    return y, st
end

function initialparameters(rng::AbstractRNG, d::Polylayer)
    return (coeffs=d.init_weight(rng, d.degree+1),)
end

Lux.initialstates(::AbstractRNG, ::Polylayer) = NamedTuple()

statelength(d::Polylayer) = 0

function parameterlength(d::Polylayer)
    return d.degree + 1
end
# ====================================
rng = Random.default_rng()
Random.seed!(rng, 0)
model = Polylayer(5, 3, Lux.rand32)
ps, st = Lux.setup(rng, model) # <<<<<< produces two empty NamedTuples.
initialparameters(rng, model) # Initializes correctly.
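For future readers, the likely culprit: `initialparameters`, `statelength`, and `parameterlength` above define brand-new functions in `Main` instead of extending Lux's generic functions, so `Lux.setup` never dispatches to them (note that `Lux.initialstates` is the one method that is qualified, which is why calling `initialparameters` directly "works" while `Lux.setup` returns an empty NamedTuple). Writing `Lux.initialparameters(...)`, or `import Lux: initialparameters`, should fix it. The dispatch issue in a Lux-free sketch, with a made-up `Framework` module standing in for Lux:

```julia
module Framework
    abstract type AbstractLayer end
    initialparameters(l::AbstractLayer) = NamedTuple()  # generic fallback
    setup(l::AbstractLayer) = initialparameters(l)
end

struct MyLayer <: Framework.AbstractLayer end

# A brand-new function in Main -- Framework.setup cannot see it:
initialparameters(::MyLayer) = (coeffs = [1.0, 2.0],)
println(Framework.setup(MyLayer()))   # NamedTuple() -- the fallback ran

# Extending the *right* generic function makes setup pick it up:
Framework.initialparameters(::MyLayer) = (coeffs = [1.0, 2.0],)
println(Framework.setup(MyLayer()))   # (coeffs = [1.0, 2.0],)
```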