Two problems with Flux

Dear all,

I am encountering some problems with Flux.
I am working on Julia 1.3.1 with Zygote v0.4.6, Flux v0.10.1, and CuArrays v1.6.0. No GPUs for the time being.

The (not-so-)minimal, partially working example is the following:

using Flux, Zygote, CuArrays, LinearAlgebra

const q = 4    # block size: each site contributes q rows of Z
const N = 10   # (unused in this reduced example)

Z = rand(Float32, 3*q, 100)   # 3 blocks of q rows, 100 samples

# One shallow chain per site.
vchain1 = [Chain(Dense(q, q), softmax),
           Chain(Dense(2*q, q), softmax)]

# Same structure, with an extra relu layer per chain.
vchain2 = [Chain(Dense(q, q, relu), Dense(q, q), softmax),
           Chain(Dense(2*q, 2*q, relu), Dense(2*q, q), softmax)]

# log clamped to zero at zero, so zero probabilities do not produce -Inf.
log0(x::Number) = x > 0 ? log(x) : zero(x)
CuArrays.@cufunc log0(x::Number) = x > 0 ? log(x) : zero(x)

function loss(x, vmodel)
    logeta = Float32(0.0)
    @inbounds for site in 1:2
        thechunk = vmodel[site]
        idxcond = 1:site*q                   # rows the prediction is conditioned on
        zcond = x[idxcond, :]
        idxsite = site*q + 1 : (site+1)*q    # rows of the predicted site
        xsite = x[idxsite, :]
        logeta += dot(log0.(thechunk(zcond)), xsite)
    end
    return -logeta
end

myloss1(x) = loss(x, vchain1)
myloss2(x) = loss(x, vchain2)

# l2-regularised variants: add the sum of the parameter norms.
myl2_loss1(x) = loss(x, vchain1) + sum(norm, Flux.params(vchain1))
myl2_loss2(x) = loss(x, vchain2) + sum(norm, Flux.params(vchain2))

All four loss functions run smoothly:

julia> [lf(Z) for lf in (myloss1,myloss2,myl2_loss1,myl2_loss2)]
4-element Array{Float32,1}:
 585.86035
 556.96045
 589.7834
 566.8873

The computation of gradients is fine for the non-l2-regularised loss functions:

julia> ∇1=Flux.gradient(()->myloss1(Z),Flux.params(vchain1))
Grads(...)
julia> ∇2=Flux.gradient(()->myloss2(Z),Flux.params(vchain2))
Grads(...)

But the regularised counterparts throw an error.

julia> ∇1=Flux.gradient(()->myl2_loss1(Z),Flux.params(vchain1))
ERROR: Mutating arrays is not supported
....

I believe that this has to do with https://github.com/FluxML/Zygote.jl/issues/231.
There are some hints there, but I would like to know if there is a suggested solution.
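
Among the hints, one that seems applicable here is to avoid LinearAlgebra.norm inside the gradient (its generic fallback apparently mutates) and build the norm from abs2, which Zygote differentiates. A sketch of what I mean (untested; the names l2norm and myl2_loss1_alt are mine):

# Possible workaround (untested): spell out the 2-norm with abs2
# instead of calling LinearAlgebra.norm inside the gradient.
l2norm(ps) = sum(sqrt(sum(abs2, p)) for p in ps)

myl2_loss1_alt(x) = loss(x, vchain1) + l2norm(Flux.params(vchain1))

(For plain weight decay, the squared form sum(sum(abs2, p) for p in ps) would be even simpler.)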

A second problem arises when I try to train the non-regularised network, for which the computation of the gradient seems to work (training with myloss2 throws the same error).

julia> Flux.train!(myloss1, Flux.params(vchain1), Z, ADAM(0.001))
ERROR: MethodError: no method matching getindex(::Float32, ::UnitRange{Int64}, ::Colon)
Closest candidates are:
  getindex(::Number) at number.jl:75
  getindex(::Number, ::Integer) at number.jl:77
  getindex(::Number, ::Integer...) at number.jl:82
...
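
My only guess so far is that train! iterates over its data argument, so passing the matrix Z directly makes it loop over single Float32 entries, which would explain the getindex(::Float32, ...) error. If so, a full-batch call presumably has to wrap Z in a one-element collection of tuples, so that each datum splats into the loss. A sketch (untested):

julia> Flux.train!(myloss1, Flux.params(vchain1), [(Z,)], ADAM(0.001))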

Any ideas?

Hi Andrea, I agree: with this latest upgrade they have essentially wrecked Flux. I got the same error as you, on code which worked perfectly in the previous 0.9.0 version. I fixed mine by downgrading (which fortunately Julia makes quite easy).
This is a short-run solution. Unless they fix Flux, I think the longer-term solution is to switch to PyTorch. Maybe they will even make a version for Julia?

Yes, PyTorch is a solution I was considering.

I feel a bit undecided, because I recently started a project for which I was able to make Flux work (on GPU too), but when I increased the depth of the layers (as in the simplified example, vchain1 → vchain2) it stopped working.

Ideally, I would like to explore Flux (which I like very much) a little more and, if no solution emerges, eventually move the whole pipeline to PyTorch.

A

Hi Andrea, try downgrading Flux temporarily (this is easy: just add the earlier version, 0.9.0). Then restart and all should work. To go back, just add the later version and restart.
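
Concretely, from the Pkg REPL (press ] at the julia> prompt), it is just:

(v1.3) pkg> add Flux@0.9.0

and, to go back later:

(v1.3) pkg> add Flux@0.10.1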

Anyway, as I said, for me downgrading worked great for fixing the issue highlighted above.

You could also try https://github.com/denizyuret/Knet.jl as an alternative until Flux settles. I’ve been using it for the past year for image classification and it works great! It’s also got tutorials, examples and good documentation.