Gradient of parameters that apply only to subsets of data

mihai · November 17, 2019, 4:07pm

Hi, I would like to optimize the parameters of an analytic function to fit heterogeneous data (different experimental conditions, partly overlapping). I’m using index arrays to record which parameter applies to which data point. This seems to be a problem for both Flux and Zygote; here’s a minimal example that tries to get the parameter gradients. I didn’t know how to pass the meta-information (arrays of last indices, here id) to the loss, so I defined the loss as a nested function that captures it:

using Flux.Tracker  # or: Zygote

function pred(f,id)
    d = Vector{Float64}(undef,id[end])
    i1 = 1
    for (n,i2) in enumerate(id)
        d[i1:i2] .= f[n]    # dummy function depending on f[n]
        i1 = i2+1
    end
    return d
end

function grad(f0,id,d)
    loss(f) = sum(abs2.(d .- pred(f,id)))
    gradient(loss,f0)
end

id = [2,3,5]        # indices defining where parameters apply
f0 = [1.0,2.0,3.0]  # parameters
d  = pred(f0,id)    # test data
grad(f0,id,d)

In Flux.Tracker, this gives me

ERROR: LoadError: MethodError: no method matching Float64(::Tracker.TrackedReal{Float64})

While Zygote complains about

ERROR: LoadError: Mutating arrays is not supported

Any (other) idea how to optimize “partly-global” parameters? Or what I’m doing wrong?

findmyway · November 18, 2019, 3:26am

In your pred function, you pre-allocate d and then mutate it. Instead, you can build the piece of the result that belongs to each entry of id and then concatenate the pieces.
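
A minimal sketch of this suggestion, assuming the goal is a pred that never writes into an existing array. Both errors trace back to the preallocated d: Tracker cannot store a TrackedReal in a Vector{Float64}, and Zygote does not support writing into arrays. The name pred_concat is hypothetical and not from the thread, and the sketch has not been checked against a specific Zygote or Tracker version.

```julia
# Non-mutating variant of pred: build one block per parameter, then concatenate.
# `id` holds the last data index of each block, e.g. [2, 3, 5].
function pred_concat(f, id)
    lens = diff([0; id])                                  # block lengths, e.g. [2, 1, 2]
    blocks = [fill(f[n], lens[n]) for n in eachindex(f)]  # one constant block per parameter
    return reduce(vcat, blocks)                           # join the blocks into one vector
end

id = [2, 3, 5]
f0 = [1.0, 2.0, 3.0]
d  = pred_concat(f0, id)
```

With an AD package loaded, the loss from the original post can then be written as `loss(f) = sum(abs2, d .- pred_concat(f, id))` and passed to `gradient(loss, f0)`.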

mihai · November 18, 2019, 8:34am

Aha, thanks for that… but I can’t figure out how to concat the results without initializing them in some form. What I tried:

function pred(f,id)
    d = Vector{Float64}(undef,id[end])          # version 1
#   d = Float64[]                               # version 2-4
    i1 = 1
    for (n,i2) in enumerate(id)
        d[i1:i2] .= f[n]                        # version 1
#       append!(d,repeat([f[n]],i2-i1+1))       # version 2
#       d = vcat(d,repeat([f[n]],i2-i1+1))      # version 3
#       for i=i1:i2; push!(d,f[n]); end         # version 4
        i1 = i2+1
    end
    return d
end

resulted in:

ERROR: Mutating arrays is not supported                      #version 1
ERROR: Can't differentiate gc_preserve_end expression        #version 2
ERROR: Mutating arrays is not supported                      #version 3
ERROR: Mutating arrays is not supported                      #version 4

Maybe there’s a nice way with [i for i in (???)] using multiple indices?
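
One possible shape for such a comprehension, as a sketch only: since id stores the last data index of each block in ascending order, the parameter for data point i is the first block whose last index is at least i. The name pred_comp is hypothetical, and whether Zygote differentiates through the searchsortedfirst call inside the comprehension is an untested assumption.

```julia
# Single-comprehension variant: look up the block of every data point.
# Assumes `id` is sorted ascending (it holds the last index of each block).
pred_comp(f, id) = [f[searchsortedfirst(id, i)] for i in 1:id[end]]

id = [2, 3, 5]
f0 = [1.0, 2.0, 3.0]
d  = pred_comp(f0, id)
```

If the lookup turns out to be a problem for the AD package, it can be computed once outside the loss, because it depends only on id and not on the parameters.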

mihai · November 18, 2019, 8:09pm

Sorry, don’t bother, I’ll switch to larger index arrays instead so I can access repeated parameters in one go without preallocation and complicated nested loops. It’s less memory efficient, but I hope autodifferentiation is worth it. Thanks again for the explanation!
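
For readers who land here, a sketch of what the larger index array could look like, assuming one parameter index is stored per data point. The names idx, pred_indexed and grad_manual are hypothetical. The hand-derived gradient is included only as a reference value to compare an AD result against, since repeated indices mean several data points contribute to the same parameter.

```julia
# One parameter index per data point (the expanded form of id = [2, 3, 5]).
idx = [1, 1, 2, 3, 3]
f0  = [1.0, 2.0, 3.0]

# Indexing with an integer array gathers the parameters without any mutation.
pred_indexed(f, idx) = f[idx]

loss(f, idx, d) = sum(abs2, d .- pred_indexed(f, idx))

# Hand-derived gradient of the loss:
# dL/df[n] = -2 * (sum of residuals over the data points that use parameter n).
function grad_manual(f, idx, d)
    r = d .- f[idx]
    return [-2 * sum(r[idx .== n]) for n in eachindex(f)]
end

d = pred_indexed(f0, idx) .+ 1.0   # shift the data so the gradient is nonzero
g = grad_manual(f0, idx, d)
```

With Zygote loaded, `gradient(f -> loss(f, idx, d), f0)[1]` would be expected to match g; comparing the two is a cheap sanity check.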
