Challenges with Zygote Differentiation in Multivariate Mixture Models as input noise of an implicit machine learning method

Okay, so typically it’s easier to help you if you isolate a Minimum Working Example (MWE) first. Apparently this is the crux of your issue:

julia> using Zygote, Distributions

julia> f(μ) = rand(MvNormal(μ, I))
f (generic function with 1 method)

julia> g(μ) = rand(MixtureModel([MvNormal(μ, I), MvNormal(μ, I)]))
g (generic function with 1 method)

julia> f([1])
1-element Vector{Float64}:
 0.9848461195702295

julia> g([1])
1-element Vector{Float64}:
 1.216579611318481

julia> Zygote.jacobian(f, [1])
([0.0;;],)

julia> Zygote.jacobian(g, [1])
ERROR: Mutating arrays is not supported -- called setindex!(Vector{Float64}, ...)
This error occurs when you ask Zygote to differentiate operations that change
the elements of arrays in place (e.g. setting values with x .= ...)

Possible fixes:
- avoid mutating operations (preferred)
- or read the documentation and solutions for this error
  https://fluxml.ai/Zygote.jl/latest/limitations

First of all, this shows us that differentiating through random number generation is not well-defined behavior (see the zero jacobian), so I think it’s worth exploring why you feel the need to do that. There might be better alternatives like the reparametrization trick.

But setting that aside for a moment, it seems that sampling from the mixture model leads to mutating operations, while sampling from the distribution alone doesn’t. I’m honestly not sure why.

1 Like