Namaste,
I am trying to implement CTC loss in Flux/Zygote and am running into the error "Mutating arrays is not supported". So I cooked up the toy example below to gain a better understanding.
Given an input x, I perform a sort of "triangulation" operation on it, where another matrix a is built from x in a complicated way. The loss is the very last element of a. When I try to differentiate the loss, the first two implementations throw an error.
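To spell out the recurrence the toy example computes (this is just triangulate1 below written as a formula, assuming the square m×n input used here; the index names k, m, n are only for illustration):

$$
a_{1,1} = x_{1,1}, \qquad
a_{j,i} = x_{j,i} \sum_{k=1}^{j} a_{k,\,i-1} \quad (2 \le i \le n,\ 1 \le j \le i), \qquad
\text{loss} = a_{m,n}
$$

All other entries of a are zero. triangulate2 and triangulate3 carry only the current column of a, so they return the last column rather than the whole matrix.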
triangulate1 errors because (I am guessing) of the setindex! operations?
triangulate3 works because it uses only matrix operations, with no in-place editing of a (the vector)?
Why triangulate2 fails is beyond me. Can you please explain?
What is the best practice when it comes to complicated losses like this that depend on a complex function of the input?
Also, is it explained somewhere what ‘mutating arrays’ means and why it does not work?
FYI: I wrote a Python implementation of CTC in Theano, and AD just worked; I did not have to manually calculate the gradients. I am hoping I can pull that off in Julia too.
Thank you.
Cash
using Zygote
using LinearAlgebra
function triangulate1(x::Matrix{T}) where T
    a = 0*x
    a[1, 1] = x[1, 1]
    for i in 2:size(x, 2)
        for j in 1:i
            a[j, i] = sum(a[1:j, i-1]) * x[j, i]
        end
    end
    a
end
function triangulate2(x::Matrix{T}) where T
    m, n = size(x)
    a = x[:,1] .* [1; zeros(T, m-1)]
    for i in 2:n
        a = accumulate(+, a) .* [x[1:i, i]; zeros(T, m-i)]
    end
    a
end
function triangulate3(x::Matrix{T}) where T
    m, n = size(x)
    a = x[:,1] .* [1; zeros(T, m-1)]
    A = LowerTriangular(fill(1, m, m))
    for i in 2:n
        a = A*a .* [x[1:i, i]; zeros(T, m-i)]
    end
    a
end
x1 = (1:4).*(1:4)'
# 4×4 Array{Int64,2}:
#  1  2   3   4
#  2  4   6   8
#  3  6   9  12
#  4  8  12  16
triangulate1(x1)
# 4×4 Array{Int64,2}:
#  1  2   6    24
#  0  4  36   336
#  0  0  54  1152
#  0  0   0  1536
triangulate2(x1)
# 4-element Array{Int64,1}:
# 24
# 336
# 1152
# 1536
triangulate3(x1)
# 4-element Array{Int64,1}:
# 24
# 336
# 1152
# 1536
gradient(x->triangulate1(x)[end], x1)[1]
# ERROR: Mutating arrays is not supported
gradient(x->triangulate2(x)[end], x1)[1]
# ERROR: Mutating arrays is not supported
# (This takes much longer to error out)
gradient(x->triangulate3(x)[end], x1)[1]
# 4×4 Array{Int64,2}:
#  1536  288  32   0
#     0  240  96   0
#     0    0  96   0
#     0    0   0  96