Mutating arrays not supported

I want to change the output in a custom Flux layer, but when I try to do so, I get a “Mutating arrays not supported” error. I’ve tried playing with things a bit and while I can act on the array as a whole (element-wise), I can’t act individually on any elements. (Declaring the struct as mutable doesn’t help.)

The code is below, with the error message following. Look inside the custom_layer function; I have a few examples of what works and what doesn’t. I need to be able to change individual elements of the array. I’m very much stuck, so I appreciate any suggestions.

using Flux.Data: DataLoader
using Flux: Params #Zygote

##### DATA #####################
num_samples = 50
x_noise_std = 0.01
y_noise_std = 0.1
function generate_data()
    x = reshape(range(0, stop=1.5π, length=num_samples), num_samples, 1)
    y_noise = rand(Normal(0,y_noise_std), 3, num_samples)
    
    # 3 outputs
    y = hcat(
        sin.(x).^2 .-0.3 .+ y_noise[1,:],
        cos.(x) .+ y_noise[2,:],
        cos.(2x).*sin.(x) .+ y_noise[3,:]
    )
    
    x = x'
    y = y'
    
    return x, y
end
X, Y = generate_data() # Training data of shape (3,num_samples)

##### CALLBACK & PLOTS #################
LossLog = Array{Float64, 1}()
LossLog_T = Array{Int, 1}()
function evalcb()
    global LossLog
    global LossLog_T
    loss_value = loss(X, Y)
    push!(LossLog, loss_value)
    push!(LossLog_T, length(LossLog))
    if mod(length(LossLog_T),100)==1
        @show LossLog[end]
    end
end

#
##### CUSTOM LAYER #########
#
mutable struct custom_layer{F,S<:AbstractArray,T<:AbstractArray}
    W::S
    b::T
    σ::F
end

#custom_layer(W, b) = custom_layer(W, b, identity)

function custom_layer(in::Integer, out::Integer, σ=softplus) 
    return custom_layer(randn(out, in), randn(out), σ)
end

Flux.@functor custom_layer  # makes trainable

function (a::custom_layer)(x::AbstractArray)
    #x_out = a.σ.(a.W * x .+ a.b)
    x_out = a.W * x .+ a.b
    
    # x_out = x_out.*2.0 # SUCCESS (trivial)
    # x_out = tanh.(x_out) # SUCCESS (trivial)
    x_out[1] = 1.0 # FAILS
    # x_out = map!(tanh, x_out, x_out) # FAILS
    
    # FAILS:
    #= f = [sin, cos, tan]
     for i in 1:size(x_out, 1)
        x_out = view(x_out, i, :)
        map!(f[i], x_out, x_out)
    end =# 

    return x_out
end

# @treelike custom_layer # some say to use @treelike, but it's not used in the Flux definition of Dense


##### MODEL & TRAINING #####################
n = 10
act_f = tanh
last_layer = custom_layer(n, 3) #Dense(n, size(Y,1))
m = Chain(Dense(size(X, 1), n, act_f), Dense(n, n, act_f), Dense(n, n, act_f), last_layer)
opt = ADAM()
loss(x, y) = mse(m(x), y)
train_loader = DataLoader(X, Y, batchsize=10, shuffle=true) 

NumEpochs = 500
if training_size_option==3 NumEpochs = convert(Int64, NumEpochs/2.0) end
for epoch in 1:NumEpochs
    for (x, y) in train_loader
        Flux.train!(loss, Flux.params(m), train_loader, opt; cb=evalcb)
    end
end

And the error message:

Mutating arrays is not supported

Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] (::Zygote.var"#1048#1049")(::Nothing) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\lib\array.jl:61
 [3] (::Zygote.var"#2775#back#1050"{Zygote.var"#1048#1049"})(::Nothing) at C:\Users\username\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:49
 [4] custom_layer at .\In[184]:63 [inlined]
 [5] (::typeof(∂(λ)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [6] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [7] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [8] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [9] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [10] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [11] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [12] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [13] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [14] Chain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:38 [inlined]
 [15] (::typeof(∂(λ)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [16] loss at .\In[184]:85 [inlined]
 [17] (::typeof(∂(loss)))(::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [18] (::Zygote.var"#174#175"{typeof(∂(loss)),Tuple{Tuple{Nothing,Nothing}}})(::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\lib\lib.jl:182
 [19] (::Zygote.var"#347#back#176"{Zygote.var"#174#175"{typeof(∂(loss)),Tuple{Tuple{Nothing,Nothing}}}})(::Float64) at C:\Users\username\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:49
 [20] #17 at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:89 [inlined]
 [21] (::typeof(∂(λ)))(::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [22] (::Zygote.var"#49#50"{Params,Zygote.Context,typeof(∂(λ))})(::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:179
 [23] gradient(::Function, ::Params) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:55
 [24] macro expansion at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:88 [inlined]
 [25] macro expansion at C:\Users\username\.julia\packages\Juno\f8hj2\src\progress.jl:134 [inlined]
 [26] train!(::typeof(loss), ::Params, ::DataLoader, ::ADAM; cb::typeof(evalcb)) at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:81
 [27] top-level scope at .\In[184]:92
2 Likes

Like it says, “Mutating arrays not supported” meaning both x_out[1] = 1 and map! aren’t allowed in code you ask it to differentiate.

However you can assemble the matrix you seem to be making without mutation, something like:

vcat(sin.(x_out[1,:])', cos.(x_out[2,:])', tan.(x_out[3,:])')
1 Like

@mcabbot, thank you for your feedback and the example.

After more reading, I think I understand the distinction between an assignment (where a variable points to) and a mutation (changing its values). A mutation will not change the objectid but an assignment will.

Yet, inside the custom layer, I can create new arrays in memory – with or without assignment: for example, these are all allowed:

  • deepcopy(x_out)
  • x_copy = deepcopy(x_out)
  • sin.(x_copy)
  • x_2 = sin.(x_copy)

The sin.(x_copy) in particular is okay (apparently) because it creates a new array but does not change one. Likewise, your example using vcat works (apparently) because it’s building an array, not changing one.

To generalize your vcat example for an arbitrary number of rows, I see on the REPL that the following works:

x_out_mod = vcat([x_out[i:i,:] for i in 1:size(x_out, 1)]...) 

To apply elements of an array f of functions to each row, we’d just do

x_out_mod = vcat([f[i].(x_out[i:i,:]) for i in 1:size(x_out, 1)]...) 

But even the first line doesn’t work inside the custom layer: I get a MethodError: no method matching iterate(::Nothing) which I can’t make heads or tails of. Replacing 1:size(x_out, 1) with 1:3 doesn’t help. Any ideas?

More conceptually, why aren’t mutations allowed? I thought I’d read that it has something to do with Flux not being able take gradients otherwise, but I don’t understand why it has no problem taking a gradient of a function broadcast to all elements and it would have a problem taking gradients of functions applied row-wise (which is my goal). Or why it’s okay to make new array but not to change an existing array (even one that is newly created and not tied to the struct).

2 Likes

Others may have a more precise summary, but roughly it’s because (1) the inputs to a function may be needed again in the backward pass, and if so would need to be copied anyway, and (2) it’s much easier to trace always results returned by functions, not side-effects. The chain rule is easy for f(g(h(x))), but not so easy if each of these also overwrites some elements of some z.

Internally, sin.(x) creates similar(x) and then mutates this. But since Zygote knows how to handle broadcasts, it doesn’t need to look inside at these details. If you supply the gradient for your function via @adjoint, then you are similarly free to do as you wish behind the scenes.

The error isn’t informative but I’m not surprised if Zygote can’t handle that. Indexing an array of functions is (from its point of view) a fairly exotic thing to be doing. You might be able to write it some other way, e.g. this isn’t quite right but it runs…

julia> fs = [sin,cos,tan];

julia> gradient(x -> sum(Core._apply.(fs, x)), rand(3))[1]
3-element Array{Tuple{Float64},1}:
 (0.826261562246469,)
 (-0.010135186129763765,)
 (1.4408074405285585,)
5 Likes

Thank you for your helpful response.

The message MethodError: no method matching iterate(::Nothing) isn’t actually related to iterating over the array of functions; I can hard code a particular function to use for each element of the array, and I get the same error.

Below is the error from declaring y4 = [sin.(xval) for (i, xval) in enumerate(x_out)] inside the custom layer. Stacktrace line [10] refers to the line in which y4 is assigned. Looking at the stacktrace, it does seem that Zygote assumes it be be an array of some partial differential type (its complaint about iterating suggests it’s of zero length?)

Maybe defining @adjoint would help make it compatible with Zygote, but I’m unfortunately out of my depth in figuring out how to use it at this point. If I were using the functions sin, cos, and tan with known derivatives cos, -sin, and 1/cos^2, how would I incorporate those into the array so that it is compatible with Zygote?

MethodError: no method matching iterate(::Nothing)
Closest candidates are:
  iterate(!Matched::Core.SimpleVector) at essentials.jl:603
  iterate(!Matched::Core.SimpleVector, !Matched::Any) at essentials.jl:603
  iterate(!Matched::ExponentialBackOff) at error.jl:253
  ...

Stacktrace:
 [1] _zip_iterate_some at .\iterators.jl:351 [inlined]
 [2] _zip_iterate_some at .\iterators.jl:353 [inlined]
 [3] _zip_iterate_all at .\iterators.jl:343 [inlined]
 [4] iterate at .\iterators.jl:333 [inlined]
 [5] iterate at .\generator.jl:44 [inlined]
 [6] collect(::Base.Generator{Base.Iterators.Zip{Tuple{Array{typeof(∂(#797)),2},Nothing}},Base.var"#3#4"{Zygote.var"#1187#1191"}}) at .\array.jl:665
 [7] map(::Function, ::Array{typeof(∂(#797)),2}, ::Nothing) at .\abstractarray.jl:2154
 [8] (::Zygote.var"#1186#1190"{Array{typeof(∂(#797)),2}})(::Nothing) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\lib\array.jl:172
 [9] (::Zygote.var"#1194#1195"{Zygote.var"#1186#1190"{Array{typeof(∂(#797)),2}}})(::Nothing) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\lib\array.jl:187
 [10] custom_layer at .\In[360]:106 [inlined]
 [11] (::typeof(∂(λ)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [12] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [13] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [14] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [15] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [16] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [17] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [18] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
 [19] (::typeof(∂(applychain)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [20] Chain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:38 [inlined]
 [21] (::typeof(∂(λ)))(::Array{Float64,2}) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [22] loss at .\In[360]:133 [inlined]
 [23] (::typeof(∂(loss)))(::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [24] #174 at C:\Users\username\.julia\packages\Zygote\YeCEW\src\lib\lib.jl:182 [inlined]
 [25] #347#back at C:\Users\username\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:49 [inlined]
 [26] #17 at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:89 [inlined]
 [27] (::typeof(∂(λ)))(::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [28] (::Zygote.var"#49#50"{Params,Zygote.Context,typeof(∂(λ))})(::Float64) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:179
 [29] gradient(::Function, ::Params) at C:\Users\username\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:55
 [30] macro expansion at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:88 [inlined]
 [31] macro expansion at C:\Users\username\.julia\packages\Juno\f8hj2\src\progress.jl:134 [inlined]
 [32] train!(::typeof(loss), ::Params, ::Base.Iterators.Zip{Tuple{Array{LinearAlgebra.Adjoint{Float64,Base.ReshapedArray{Float64,2,StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}},Tuple{}}},1},Array{LinearAlgebra.Adjoint{Float64,Array{Float64,2}},1}}}, ::ADAM; cb::typeof(evalcb)) at C:\Users\username\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:81
 [33] top-level scope at .\In[360]:140

It’s probably irrelevant, but I’ve tried four approaches to applying a unique function to each row that work in the REPL but fail in the custom layer in an identical manner, including the original:

 y1 = vcat([fs[i].(x[i:i, :]) for i in 1:size(x_out,1)]...)

and the functionally equivalent:

y2 = vcat([map(x->fs[i](x), x[i:i, :]) for i in 1:size(x_out,1)]...)

and a version inspired by your use of Core._apply:

y3 = vcat([map(x->Core._apply.(fs[i], x), x[i:i, :]) for i in 1:size(x_out,1)]...)

The fourth omits the use of vcat:

y4 = [fs[mod(i-1,3)+1].(xval) for (i, xval) in enumerate(x_out)]

For testing in the REPL, we can define fs = [sin cos tan] and x_out = rand(3,4). And in each of the above, replacing fs[i]. with sin. still gives the error from within the custom layer.

To write adjoints you will need to read the manual, and probably look at some of these examples.

Zygote’s errors are usually from 10 layers underground, I don’t quite know why each of these fails.

But I have a package solution:

julia> using Tullio, ForwardDiff, Zygote

julia> fs = [sin,cos,tan];

julia> rowmap(fs, x) = @tullio y[r,c] := begin
         f = getindex(fs, r)
         @inbounds f(x[r,c])
       end grad=Dual;

julia> gradient(x -> sum(rowmap(fs, x)), ones(3,10))[1]
3×10 Array{Float64,2}:
  0.540302   0.540302   0.540302   0.540302  …   0.540302   0.540302   0.540302   0.540302
 -0.841471  -0.841471  -0.841471  -0.841471     -0.841471  -0.841471  -0.841471  -0.841471
  3.42552    3.42552    3.42552    3.42552       3.42552    3.42552    3.42552    3.42552

This is probably quite efficient, compared to making slices. If you check axes(fs,1)==axes(x,1) then you could use @inbounds getindex(fs, r).

4 Likes

That works!! Many, many thanks. I’m now able to use an array of functions in a custom layer. I’m going to work with it a bit and clean it up, then I’ll post the working code.

I also am very happy to learn about Tullio, which looks like it’ll a) let me do assignments the way I think about them (more of a Matlab / Igor Pro background), and b) do them more efficiently, naturally. Thanks again.

1 Like

Hi @mcabbott… looking at the Tullio package, I just noticed that you are also the author – nice work! :smiley:

Building on your rowmap example, I wrote a little function to un-normalize values in an array x; for each row, there is a single pre-normalization min and max value (corresponding, say, to the min/max of the target for that row/output); these are stored in a min_max array having two columns (for the min and max) and a number of rows equal to the number the number of rows in x.)

This works, as it stands. I would like to be able to uncomment the lines I’ve commented below in order to support optionally having a third column. Doing so results in an array of zeros – both when min_max has two and when it has three columns. If I try using this inside a layer function (as I did with rowmap) it works fine if the nonneg lines are commented, but when uncommented, in spite of my conditionals, using a 2-column min_max array with these portions enabled results in BoundsError: attempt to access 3×2 Array{Float64,2} at index [1, 3]. Do you know what might be going on and how I might get around it?

"Applies a row-dependent un-normalization to each element of x.
Inverse of rowwise_norm(); see notes for rowwise_norm.

WANT TO ALLOW an optional third column of min_max to be supplied, which
would be interpreted as a boolean telling us whether to clip negative values
to zero for the corresponding rows. But even with the code below, this
generates a BoundsError if min_max has only the standard 2 columns."
rowwise_unnorm(min_max, x) = @tullio y[r,c] := begin
        #nonneg = size(min_max,2)>2 ? min_max[r,3] > 0 : false
        offset = min_max[r,2]
        scale = min_max[r,1] - offset
        output_rc = @inbounds x[r,c]*scale + offset
        #nonneg ? output_rc : 0.
       end grad=Dual;

The test code below works when the nonneg lines are commented out. When uncommented, the output is full of zeros, regardless of whether min_max has two or three columns.

A = rand(3,6); 
min_max = [10. 20.; -100. 100.; 5. 6.]
min_max3 = [10. 20. 0.; -100. 100. 0.; 5. 6. 0.]
display(a)
display(rowwise_unnorm(min_max, A))
display(rowwise_unnorm(min_max3, A))
------------------------------------
# Input, A:
3×6 Array{Float64,2}:
 0.920267  0.124377  0.931977  0.780142  0.399      0.928706
 0.101141  0.122639  0.129025  0.980223  0.308906   0.34805
 0.33944   0.678434  0.622225  0.865139  0.0923755  0.961116

# Output when nonneg lines commented; it works
3×6 Array{Float64,2}:
 10.7973   18.7562   10.6802    12.1986   16.01     10.7129
 79.7717   75.4722   74.1951   -96.0446   38.2188   30.3899
  5.66056   5.32157   5.37778    5.13486   5.90762   5.03888

# Output when nonneg lines uncommented, 
# both when using min_max and min_max3.
3×6 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0

Thanks in advance for any pointers!

I don’t see the BoundsError, but I do see the zeros.

But isn’t zero expected here? When min_max has two columns, then nonneg == false, giving zero. When it has three, the third column never has a positive value, so again nonneg == false for all rows.

I would also be tempted to pull the size(min_max,2)>2 out of the loops, and write something more like this:

julia> function rowwise_unnorm2(m, x)
         if size(m,2)==2
           @tullio y[r,c] := x[r,c]*(m[r,1] - m[r,2]) + m[r,2]
         else
           @tullio y[r,c] := (x[r,c]*(m[r,1] - m[r,2]) + m[r,2]) * (m[r,3]>0)
         end
       end

julia> rowwise_unnorm2(min_max, A)
3×6 Array{Float64,2}:
  11.6186  13.9026    13.6731    11.8118    10.7491    16.6908
 -71.2031  41.4436   -66.862    -14.9735   -92.3468   -70.8891
   5.0927   5.36377    5.34413    5.54949    5.35215    5.93428

julia> min_max3 = [10. 20. +999; -100. 100. -999; 5. 6. +999]

julia> rowwise_unnorm2(min_max3, A)
3×6 Array{Float64,2}:
 11.6186  13.9026   13.6731   11.8118   10.7491   16.6908
 -0.0      0.0      -0.0      -0.0      -0.0      -0.0
  5.0927   5.36377   5.34413   5.54949   5.35215   5.93428

(And, thanks! I hope it’s useful.)

Okay, you’re right – the zeros should have been expected. I had a programming error. In the last line, instead of nonneg ? output_rc : 0. I should have had:

nonneg && output_rc < 0. ? 0. : output_rc

So if there are three columns and the third column specifies the output should be nonnegative for that row, then nonneg == true; if the output is also negative, then in the revised code that particular value is set to zero, as I had originally intended.

I didn’t like how I had the repeated evaluations, but I didn’t know what the syntax limitations were. Your version is much cleaner and more legible!

Incorporating your version with my code correction, we have

function rowwise_unnorm(m, x)
    if size(m,2)==2
        @tullio y[r,c] := x[r,c]*(m[r,1] - m[r,2]) + m[r,2] grad=Dual;
    else
        @tullio y[r,c] := begin
            val = x[r,c]*(m[r,1] - m[r,2]) + m[r,2]
            m[r,3]>0 && val < 0. ? 0. : val
            end grad=Dual;
    end
end

This works for two columns and for three:

julia> min_max = [-10. 10.; -100. 100.; 5. 6.]
julia> min_max3 = [-10. 10. 0.; -100. 100. 1.; 5. 6. 0.]

julia> display(rowwise_unnorm(min_max, A))
3×6 Array{Float64,2}:
 -8.40534   7.51246  -8.63955   -5.60284   2.02     -8.57412
 79.7717   75.4722   74.1951   -96.0446   38.2188   30.3899
  5.66056   5.32157   5.37778    5.13486   5.90762   5.03888

julia> display(rowwise_unnorm(min_max3, A))
3×6 Array{Float64,2}:
 -8.40534   7.51246  -8.63955  -5.60284   2.02     -8.57412
 79.7717   75.4722   74.1951    0.0      38.2188   30.3899
  5.66056   5.32157   5.37778   5.13486   5.90762   5.03888

# For the example with min_max3, note that because min_max3[2,3]==1.,
# nonnegativity is enforced for the second row, and the output at [2,4], 
# which was negative, is set to 0. And because min_max3[1,3]==0., 
# no nonnegativity is enforced for the first row and the negative values 
# are retained.

The corresponding normalization routine would be

rowwise_norm(m, x) = @tullio y[r,c] := begin
    (x[r,c] - m[r,2])/(m[r,1] - m[r,2])
    end grad=Dual;

Again, thank you!

1 Like

Edited response to include the corresponding rowwise_norm function and to add the missing grad=Dual statements. It seems to work, but please do let me know if I didn’t use the grad=Dual statements correctly.

I also failed to mention that I no longer get the BoundsError when calling from within a custom layer. :slight_smile: