Getting strange "UndefVarError: xs not defined" in Zygote / Flux

I’ve got a fairly simple custom 2-layer MLP implementation but I get this strange error when trying to get the gradient:

LoadError: UndefVarError: xs not defined
UndefVarError: xs not defined

Here’s my full code

using Flux
using Zygote
using LinearAlgebra;

mutable struct Layer
    theta::Matrix{Float64}
    bias::Matrix{Float64}
end

layer1 = Layer(randn(784,512),randn(1,512))
layer2 = Layer(randn(512,10),randn(1,10))
layers = [layer1,layer2];

function run_layer(X::Matrix,layer::Layer, cols::Vector{Int64}, rows::Vector{Int64})
    e1 = isempty(cols)
    e2 = isempty(rows)
    bias = e1 ? layer.bias : layer.bias[:,cols]
    theta = e1 ? layer.theta : layer.theta[:,cols]
    theta = e2 ? theta : theta[rows,:]
    y_ = X * theta .+ bias 
    return y_, cols
end

function model(X::Array,layers::Vector{Layer}, S::Vector)
    A1, c1 = run_layer(X,layers[1],S,Vector{Int64}([]))
    A1 = Flux.normalise(A1;dims=ndims(A1), ϵ=1e-5)
    A1 = NNlib.relu.(A1)
    A2, c2 = run_layer(A1,layers[2],Vector{Int64}([]),c1)
    A2 = Flux.normalise(A2;dims=ndims(A2), ϵ=1e-5)
    A2 = NNlib.softmax(A2,dims=2)
    return A2
end

lossfn(ŷ::Vector{Float64},y::Vector{Float64}) = -1.0 * dot(log.(ŷ),y)

S = [1,50,90,112,145,240,300,301,500,505]
g = Zygote.gradient(w -> lossfn(vec(model(randn(1,784),w,S)),[1.0,0,0,0,0,0,0,0,0,0]),layers)
# error

I appreciate any guidance on this

I was able to rewrite my code to do the same thing without getting this error, though I'm not sure why. I may in essence have been mutating an array and just getting a poor error message for it.

Looks like it was supposed to be a mutation error, but a typo released in v0.6.15 made the formatting of the error message itself fail. It should be fixed in v0.6.18 and later.
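
For anyone landing here with the same symptom, here is a minimal sketch of what a mutation inside a differentiated function looks like, next to a non-mutating rewrite. The function names `sum_doubled_mutating` and `sum_doubled` are made up for illustration and are not from the code above. The exact wording of the error depends on the Zygote version, so the sketch only checks that an error is thrown.

```julia
using Zygote

# Hypothetical example: writes into a preallocated buffer element by element.
# Zygote cannot differentiate through the in-place setindex! call.
function sum_doubled_mutating(x)
    buf = zeros(length(x))
    for i in eachindex(x)
        buf[i] = 2 * x[i]   # in-place mutation of buf
    end
    return sum(buf)
end

# Same computation written without mutation, using broadcasting.
sum_doubled(x) = sum(2 .* x)

x = [1.0, 2.0, 3.0]

# Taking the gradient of the mutating version is expected to throw.
mutation_failed = try
    Zygote.gradient(sum_doubled_mutating, x)
    false
catch err
    println("gradient failed with: ", typeof(err))
    true
end

# The non-mutating version differentiates normally.
g = Zygote.gradient(sum_doubled, x)[1]
```

If the gradient call fails like this in your own code, looking for `a[i] = ...`, `push!`, or other in-place updates in the forward pass is usually a good first step.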

P.S. if you wouldn’t mind some unsolicited suggestions for the code itself, Flux.normalise is the only function that needs to be qualified to use. The rest are all exported by default, e.g. Flux reexports gradient and relu from Zygote and NNlib respectively. With respect to type annotations, the general convention is to use them judiciously (exceptions documented here) so that the model can work with Float32 arrays, GPU arrays, array views etc. without any implementation changes.
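
To make the type annotation point concrete, here is a small sketch in plain Julia. The function `affine` is hypothetical and only mirrors the `X * theta .+ bias` step from `run_layer`. Annotating with abstract types such as `AbstractMatrix` (or leaving the annotations off) does not cost performance, because Julia compiles a specialized method for the concrete argument types it is called with.

```julia
# Hypothetical, loosely typed version of the affine step in run_layer.
affine(X::AbstractMatrix, theta::AbstractMatrix, bias::AbstractMatrix) = X * theta .+ bias

# Works with the default Float64 matrices...
X64 = randn(2, 4)
theta64 = randn(4, 3)
bias64 = randn(1, 3)
y64 = affine(X64, theta64, bias64)

# ...with Float32 matrices, and the result stays Float32...
X32 = randn(Float32, 2, 4)
theta32 = randn(Float32, 4, 3)
bias32 = randn(Float32, 1, 3)
y32 = affine(X32, theta32, bias32)

# ...and with views that select a subset of columns without copying.
cols = [1, 3]
yview = affine(X64, view(theta64, :, cols), view(bias64, :, cols))
```

With signatures like `X::Matrix` and `Vector{Float64}`, the view call above would raise a `MethodError`, since a `SubArray` is not a `Matrix`.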

Thanks for the explanation and thanks for the code suggestions! I was under the mistaken impression that Julia wanted me to write the most specific type annotations for performance reasons, but I see from the docs that this is wrong.
