Flux.jl issues

I am a bit perplexed by the issue here. I am trying to create a model:

using Flux

critic_model1 = Chain(
    boardmove -> (reshape(boardmove[1], 4, 4, 1, 1), boardmove[2]),
    boardmove -> (Conv((2,2), 1=>64, relu)(boardmove[1]), boardmove[2]),
    boardmove -> (Conv((2,2), 64=>128, relu)(boardmove[1]), boardmove[2]),
    boardmove -> vcat(reshape(boardmove[1], 2*2*128), boardmove[2]),
    Dense(2*2*128+4, 256, relu),
    Dense(256, 1, exp)
)

loss_qa(boardmove, expected_value) = begin
    oldSA = critic_model1(boardmove)
    sum((oldSA .- expected_value).^2)
end


board = zeros(Float32, 4, 4)
opt = ADAM()
p = params(critic_model1) # parameters passed to train! below
X = (board, rand(Float32, 4))
Y = Float32[10.0]

loss_qa(X, Y) # this works

Flux.train!(loss_qa, p, [(X, Y)], opt) # this doesn't

This feels like a relatively straightforward network to me.

But it’s failing on Flux v0.10 and v0.11 with the error below, and I can’t make sense of it at all. I don’t have any Int64 types in my data, so this is quite confusing.

ERROR: Need an adjoint for constructor Pair{Int64,Int64}. Gradient is of type Tuple{Float32,Float32}
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] (::Zygote.Jnew{Pair{Int64,Int64},Nothing,false})(::Tuple{Float32,Float32}) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\lib\lib.jl:288
 [3] (::Zygote.var"#1766#back#194"{Zygote.Jnew{Pair{Int64,Int64},Nothing,false}})(::Tuple{Float32,Float32}) at C:\Users\RTX2080\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:49
 [4] Pair at .\pair.jl:12 [inlined]
 [5] Pair at .\pair.jl:15 [inlined]
 [6] (::typeof(∂(Pair)))(::Tuple{Float32,Float32}) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0     
 [7] #45 at .\REPL[35]:4 [inlined]
 [8] (::typeof(∂(#45)))(::Tuple{Array{Float32,4},Array{Float32,1}}) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0
 [9] applychain at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\layers\basic.jl:36 [inlined]
 [10] (::typeof(∂(applychain)))(::Array{Float32,1}) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0    
 [11] applychain at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\layers\basic.jl:36 [inlined]
 [12] (::typeof(∂(applychain)))(::Array{Float32,1}) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0
 [13] applychain at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\layers\basic.jl:36 [inlined]
 [14] (::typeof(∂(applychain)))(::Array{Float32,1}) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0    
 [15] Chain at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\layers\basic.jl:38 [inlined]
 [16] (::typeof(∂(λ)))(::Array{Float32,1}) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0
 [17] loss_qa at .\REPL[37]:2 [inlined]
 [18] (::typeof(∂(loss_qa)))(::Float32) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0
 [19] #179 at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\lib\lib.jl:178 [inlined]
 [20] #1732#back at C:\Users\RTX2080\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:49 [inlined]
 [21] #15 at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\optimise\train.jl:83 [inlined]
 [22] (::typeof(∂(λ)))(::Float32) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface2.jl:0
 [23] (::Zygote.var"#54#55"{Zygote.Params,Zygote.Context,typeof(∂(λ))})(::Float32) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface.jl:177
 [24] gradient(::Function, ::Zygote.Params) at C:\Users\RTX2080\.julia\packages\Zygote\iFibI\src\compiler\interface.jl:54
 [25] macro expansion at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\optimise\train.jl:82 [inlined]
 [26] macro expansion at C:\Users\RTX2080\.julia\packages\Juno\tLMZd\src\progress.jl:134 [inlined]
 [27] train!(::Function, ::Zygote.Params, ::Array{Tuple{Tuple{Array{Float32,2},Array{Float32,1}},Array{Float32,1}},1}, ::ADAM; cb::Flux.Optimise.var"#16#22") at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\optimise\train.jl:80
 [28] train!(::Function, ::Zygote.Params, ::Array{Tuple{Tuple{Array{Float32,2},Array{Float32,1}},Array{Float32,1}},1}, ::ADAM) at C:\Users\RTX2080\.julia\packages\Flux\IjMZL\src\optimise\train.jl:78
 [29] top-level scope at REPL[45]:1
 [30] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1088

You’re creating new convolution layers inside the closures every time you call your model. I believe the error you’re seeing is Zygote trying to differentiate through the layer constructors: the 1=>64 in Conv((2,2), 1=>64, relu) builds a Pair{Int64,Int64} on every forward pass, which is exactly the constructor the error message complains about. Move the Conv(...) calls outside of the closures so the layers are constructed once.

Try replacing the layer

boardmove -> (Conv((2,2), 1=>64, relu)(boardmove[1]), boardmove[2])

with

c1 = Conv((2,2), 1=>64, relu) # first make the conv layer

boardmove -> (c1(boardmove[1]), boardmove[2]) # then wrap it in a closure

and similarly for the other layer.
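Putting it all together, a corrected version might look like this (an untested sketch; the params(c1, c2, critic_model1) call is my addition, since Flux’s params won’t find layers that are only captured inside anonymous closures):

using Flux

c1 = Conv((2,2), 1=>64, relu)   # conv layers constructed once, up front
c2 = Conv((2,2), 64=>128, relu)

critic_model1 = Chain(
    boardmove -> (reshape(boardmove[1], 4, 4, 1, 1), boardmove[2]),
    boardmove -> (c1(boardmove[1]), boardmove[2]),  # apply, don't construct
    boardmove -> (c2(boardmove[1]), boardmove[2]),
    boardmove -> vcat(reshape(boardmove[1], 2*2*128), boardmove[2]),
    Dense(2*2*128+4, 256, relu),
    Dense(256, 1, exp)
)

loss_qa(boardmove, expected_value) = begin
    oldSA = critic_model1(boardmove)
    sum((oldSA .- expected_value).^2)
end

board = zeros(Float32, 4, 4)
opt = ADAM()
X = (board, rand(Float32, 4))
Y = Float32[10.0]

# The closures hide c1 and c2 from params(critic_model1), which would only
# collect the two Dense layers, so list them explicitly:
p = params(c1, c2, critic_model1)

Flux.train!(loss_qa, p, [(X, Y)], opt)

With the layers hoisted out of the closures, Zygote only has to differentiate their application; the 1=>64 Pair is built outside the differentiated code, so its constructor never needs an adjoint.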
