Gradient of sum

e3c6 · February 16, 2021, 11:09am

using Flux

T = randn(10, 10)
ps = Flux.params(T)
opt = ADAM()
for iter = 1:100
    gs = gradient(ps) do
        sum(T)
    end
    Flux.update!(opt, ps, gs)
end

throws an error:

ArgumentError: Cannot setindex! to 0.0009999999900000003 for an AbstractFill with value 1.0.

Stacktrace:
 [1] setindex! at /home/cossio/.julia/packages/FillArrays/tE9Xq/src/FillArrays.jl:41 [inlined]
 [2] _setindex! at ./abstractarray.jl:1176 [inlined]
 [3] setindex! at ./abstractarray.jl:1153 [inlined]
 [4] macro expansion at ./broadcast.jl:932 [inlined]
 [5] macro expansion at ./simdloop.jl:77 [inlined]
 [6] copyto! at ./broadcast.jl:931 [inlined]
 [7] copyto! at ./broadcast.jl:886 [inlined]
 [8] materialize! at ./broadcast.jl:848 [inlined]
 [9] materialize!(::FillArrays.Fill{Float64,2,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}}, ::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(*),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(/),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(/),Tuple{Array{Float64,2},Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0},Nothing,typeof(-),Tuple{Int64,Float64}}}},Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(+),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(sqrt),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2},Nothing,typeof(/),Tuple{Array{Float64,2},Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0},Nothing,typeof(-),Tuple{Int64,Float64}}}}}},Float64}}}},Float64}}) at ./broadcast.jl:845
 [10] apply!(::ADAM, ::Array{Float64,2}, ::FillArrays.Fill{Float64,2,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}}) at /home/cossio/.julia/packages/Flux/05b38/src/optimise/optimisers.jl:177
 [11] update!(::ADAM, ::Array{Float64,2}, ::FillArrays.Fill{Float64,2,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}}}) at /home/cossio/.julia/packages/Flux/05b38/src/optimise/train.jl:23
 [12] update!(::ADAM, ::Params, ::Zygote.Grads) at /home/cossio/.julia/packages/Flux/05b38/src/optimise/train.jl:29
 [13] top-level scope at In[89]:8
 [14] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091

What is going on here?

e3c6 · February 16, 2021, 11:34am

https://github.com/FluxML/Flux.jl/issues/1510

simeonschaub · February 16, 2021, 11:56am

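The issue here is that Flux.apply! assumes gradients are mutable, but Zygote returned a FillArrays.Fill here, since a lazy Fill can be a lot more efficient in some cases. ADAM's apply! writes the scaled step back into the gradient array in place, which is the setindex! call that fails in the stack trace. The easiest solution would be to pass Array.(gs) to Flux.update! instead of gs.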
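
A minimal sketch of that workaround, assuming the same Flux version as in the stack trace (one that still provides ADAM and Flux.params). Instead of broadcasting over gs, this variant converts the gradient of the single parameter T explicitly and calls the per-array method Flux.update!(opt, x, grad) that shows up in frame [11] of the stack trace. T0 is a name introduced here only to check the result.

```julia
using Flux

T = randn(10, 10)
T0 = copy(T)              # illustration only: keep the initial values for comparison
ps = Flux.params(T)
opt = ADAM()
for iter = 1:100
    gs = gradient(ps) do
        sum(T)
    end
    # gs[T] may be a lazy, immutable Fill; Array(...) copies it into a mutable Array
    Flux.update!(opt, T, Array(gs[T]))
end
```

Since the gradient of sum(T) is 1 for every entry, every entry of T should end up smaller than the corresponding entry of T0.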

e3c6 · February 16, 2021, 11:58am

It would be nice to solve this within Flux rather than in user code. Maybe Flux.apply!(...) should dispatch on the type of gs, and exploit the mutability of gs when appropriate.
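
To make the idea concrete, here is a rough sketch in plain Julia. The helper name writable is hypothetical and is not part of Flux or Zygote. It only shows the dispatch pattern: reuse a gradient that is already a mutable Array, and copy anything else into one.

```julia
# Hypothetical helper, not part of Flux: return a gradient that is safe to write into.
writable(g::Array) = g                  # already a mutable Array: reuse it, no copy
writable(g::AbstractArray) = Array(g)   # any other array type (e.g. a lazy one): copy

a = [1.0, 2.0]
r = 1:3                                 # a range stands in for an immutable array here

wa = writable(a)                        # the very same object as a
wr = writable(r)                        # a fresh Vector with the same values
wr[1] = 10                              # writing now works; r itself is untouched
```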

simeonschaub · February 16, 2021, 12:06pm

It’s not quite that straightforward: the current optimizer API relies heavily on IdDicts, so it might produce wrong results for stateful optimizers with immutable types. The proper fix here is a new optimizer API, combined with using Zygote’s explicit differentiation API instead of the implicit one. See also https://github.com/FluxML/Flux.jl/pull/1481, but there might be more to be done for this to work seamlessly.
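
To illustrate the IdDict point, a small sketch in plain Julia (no Flux involved). An IdDict looks keys up by identity (===). Two distinct mutable arrays are always different keys, but immutable values with equal contents are the same key. The tuples below are only a stand-in for an immutable parameter type, and the symbols stand in for optimizer state.

```julia
# Mutable arrays: equal contents, but distinct objects, so distinct keys.
state = IdDict()
a = [1.0, 2.0]
b = [1.0, 2.0]
state[a] = :state_for_a
state[b] = :state_for_b
n_mutable = length(state)       # two separate entries

# Immutable values: equal contents means identical under ===, so they collide.
istate = IdDict()
p = (1.0, 2.0)
q = (1.0, 2.0)
istate[p] = :state_for_p
istate[q] = :state_for_q        # overwrites the entry stored for p
n_immutable = length(istate)    # a single shared entry

# An out-of-place update produces a new value, so the stored state is no longer found.
p2 = p .- 0.1
found = haskey(istate, p2)
```

So with immutable parameters, two parameters that happen to hold equal values would share optimizer state, and a parameter would lose its state after every update.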
