Namaste,
I am trying to implement CTC loss in Flux/Zygote and am running into "Error: Mutating arrays is not supported". So I cooked up the toy example below to gain a better understanding.
Given an input x, I perform some sort of "triangulation" operation on it, where another matrix a is built from x in a complicated way. The loss is the very last element of a. When I try to differentiate the loss, the first two implementations throw an error.
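In equation form, this is the recurrence I mean, as I read it off the triangulate1 code below (m and n stand for the number of rows and columns of x; every entry of a not listed is zero):

```latex
\begin{aligned}
a_{1,1} &= x_{1,1}, \\
a_{j,i} &= x_{j,i} \sum_{k=1}^{j} a_{k,i-1}, \qquad 2 \le i \le n,\quad 1 \le j \le i, \\
\text{loss} &= a_{m,n}.
\end{aligned}
```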
triangulate1 errors because (I am guessing) of the setindex! operations?
triangulate3 works because it uses matrix operations and no in-place editing of a (the vector)?
Why triangulate2 fails is beyond me… Can you please explain?
What is the best practice for complicated losses like this, which depend on a complex function of the input?
Also, is it explained somewhere what ‘mutating arrays’ means and why it does not work?
FYI: I wrote a Python implementation of CTC in Theano, and AD just worked; I did not have to calculate the gradients manually. I am hoping I can pull that off in Julia too.
Thank you.
Cash
using Zygote
using LinearAlgebra

# Builds a column by column, writing each entry in place (setindex!).
function triangulate1(x::Matrix{T}) where T
    a = 0*x
    a[1, 1] = x[1, 1]
    for i in 2:size(x, 2)
        for j in 1:i
            a[j, i] = sum(a[1:j, i-1]) * x[j, i]
        end
    end
    a
end

# Same recurrence, keeping only the current column; accumulate(+, a) gives the running sums.
function triangulate2(x::Matrix{T}) where T
    m, n = size(x)
    a = x[:,1] .* [1; zeros(T, m-1)]
    for i in 2:n
        a = accumulate(+, a) .* [x[1:i, i]; zeros(T, m-i)]
    end
    a
end

# Same as triangulate2, but the running sum is a product with a lower-triangular matrix of ones.
function triangulate3(x::Matrix{T}) where T
    m, n = size(x)
    a = x[:,1] .* [1; zeros(T, m-1)]
    A = LowerTriangular(fill(1, m, m))
    for i in 2:n
        a = A*a .* [x[1:i, i]; zeros(T, m-i)]
    end
    a
end

x1 = (1:4).*(1:4)'
# 4×4 Array{Int64,2}:
# 1 2 3 4
# 2 4 6 8
# 3 6 9 12
# 4 8 12 16
triangulate1(x1)
# 4×4 Array{Int64,2}:
# 1 2 6 24
# 0 4 36 336
# 0 0 54 1152
# 0 0 0 1536
triangulate2(x1)
# 4-element Array{Int64,1}:
# 24
# 336
# 1152
# 1536
triangulate3(x1)
# 4-element Array{Int64,1}:
# 24
# 336
# 1152
# 1536
gradient(x->triangulate1(x)[end], x1)[1]
# ERROR: Mutating arrays is not supported
gradient(x->triangulate2(x)[end], x1)[1]
# ERROR: Mutating arrays is not supported
# (This takes much longer to error out)
gradient(x->triangulate3(x)[end], x1)[1]
# 4×4 Array{Int64,2}:
# 1536 288 32 0
# 0 240 96 0
# 0 0 96 0
# 0 0 0 96
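
As a sanity check on the triangulate3 gradient above, here is a minimal finite-difference sketch that does not go through Zygote at all. The helper name fd_gradient and the step size h are my own made-up choices for illustration, and triangulate3 is repeated only so the snippet runs on its own. The in-place updates inside fd_gradient are harmless here because nothing in it is being differentiated by AD.

```julia
using LinearAlgebra

# Same computation as triangulate3 in the post, repeated so this snippet is self-contained.
function triangulate3(x::Matrix{T}) where T
    m, n = size(x)
    a = x[:, 1] .* [1; zeros(T, m-1)]
    A = LowerTriangular(fill(1, m, m))
    for i in 2:n
        a = A*a .* [x[1:i, i]; zeros(T, m-i)]
    end
    a
end

# Hypothetical helper (not part of Zygote or Flux): central finite-difference
# gradient of a scalar-valued function f at the matrix x.
function fd_gradient(f, x::Matrix{Float64}; h=1e-3)
    g = zeros(size(x))
    for k in eachindex(x)
        xp = copy(x); xp[k] += h   # perturb one entry upwards
        xm = copy(x); xm[k] -= h   # and downwards
        g[k] = (f(xp) - f(xm)) / (2h)
    end
    g
end

# Float64 copy of x1, since a finite-difference step needs non-integer inputs.
x1f = Float64.((1:4) .* (1:4)')
g = fd_gradient(x -> triangulate3(x)[end], x1f)
```

Since the loss is linear in each individual entry of x, the central difference should agree with the Zygote result for triangulate3 up to floating-point error.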