This video explains why you need the gradient of a noise function.

In most programming languages, you'd have to either write your own noise implementation to get the gradient or dive deep into the library's internals to make it work.

Not here!

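```julia
using CoherentNoise
using ForwardDiff
using StaticArrays

sampler = opensimplex2_2d(seed=1)

# Gradient of the noise value with respect to the (x, y) sample coordinates.
function sample_gradient(sampler, x, y)
    arr = @SVector [x, y]
    f = a -> @inbounds sample(sampler, a[1], a[2])
    return ForwardDiff.gradient(f, arr)
end
```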

Unfortunately, I had no luck with Enzyme or Zygote. Maybe someone can explain why.
Edit: ReverseDiff works too.

Are we sure this snippet is taking the same gradient as the one the video is talking about? No crash != it's doing the right thing, I guess.

I tried ForwardDiff and ReverseDiff and they gave the same result. Approximating the gradient with finite differences gives roughly the same result too.
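
A finite-difference check of this kind can be sketched as follows. This is a minimal sketch, not the code that was actually run: `central_difference_gradient`, `height`, and `height_gradient` are hypothetical names, and `height` is a stand-in so the snippet runs without extra packages. For the real check you would pass `(x, y) -> sample(sampler, x, y)` and compare against the autodiff result.

```julia
# Central finite-difference approximation of the gradient of f(x, y).
# `central_difference_gradient` is a hypothetical helper, not part of any package.
function central_difference_gradient(f, x, y; h=1e-6)
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return [dfdx, dfdy]
end

# Stand-in for the noise function (assumption: any smooth f(x, y) works here).
height(x, y) = sin(3 * x) * cos(2 * y)

# Analytic gradient of the stand-in, playing the role of the autodiff result.
height_gradient(x, y) = [3 * cos(3 * x) * cos(2 * y), -2 * sin(3 * x) * sin(2 * y)]

approx = central_difference_gradient(height, 0.87, 0.88)
exact = height_gradient(0.87, 0.88)

# The two should agree up to the truncation and rounding error of the finite difference.
println(isapprox(approx, exact; atol=1e-6))
```

Agreement within a small tolerance is evidence that the derivative is being computed correctly, though it does not say whether that derivative is the one the video needs.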

Hmm, I quickly watched the video. IIUC it's not actually taking the gradient of the sampling function itself; the gradient is taken on the “landscape”, which starts out as a random 2D matrix (value = landscape height).

So really what's working is that the ForwardDiff and ReverseDiff packages know to “skip” diffing into the `sample(::sampler, ...)` call, rather than crashing?

The landscape was likely generated by sampling lots of points in this fashion, with one sampled point corresponding to each entry in the matrix. In real use cases, you would need to sample a grid. This was just a proof of concept that it works.

Right, and I'm saying the video talks about taking the gradient of the landscape (i.e. how height changes with respect to changes in the `x,y` coordinates). At no point in this process do you need to take the gradient “through” the `sample()` call.
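
The “gradient of the landscape” reading can be sketched directly on a height matrix with finite differences, without differentiating through `sample()`. This is a minimal sketch assuming a regular grid with spacing `dx`; `landscape_gradient` is a hypothetical helper, and the height matrix below is a stand-in plane rather than real noise output.

```julia
# Finite-difference gradient of a height matrix sampled on a regular grid.
# heights[i, j] is the height at x = xs[i], y = ys[j]; `dx` is the grid spacing.
# Central differences in the interior, one-sided differences at the edges.
# Assumes at least two samples along each axis.
function landscape_gradient(heights::AbstractMatrix, dx::Real)
    nx, ny = size(heights)
    gx = similar(heights, Float64)
    gy = similar(heights, Float64)
    for j in 1:ny, i in 1:nx
        ilo, ihi = max(i - 1, 1), min(i + 1, nx)
        jlo, jhi = max(j - 1, 1), min(j + 1, ny)
        gx[i, j] = (heights[ihi, j] - heights[ilo, j]) / ((ihi - ilo) * dx)
        gy[i, j] = (heights[i, jhi] - heights[i, jlo]) / ((jhi - jlo) * dx)
    end
    return gx, gy
end

# Stand-in landscape: a plane with slope 2 along x and -1 along y.
dx = 0.1
xs = 0:dx:1
ys = 0:dx:1
heights = [2 * x - y for x in xs, y in ys]

gx, gy = landscape_gradient(heights, dx)
println(all(isapprox.(gx, 2.0; atol=1e-9)) && all(isapprox.(gy, -1.0; atol=1e-9)))
```

The trade-off is accuracy: the matrix version only sees the sampled heights, so its gradient is limited by the grid spacing, whereas differentiating the noise function gives the slope at the exact coordinate.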

The `sample` function takes the noise sampler and the coordinates and returns the value of the noise (which maps to height in this case). You need to take the gradient of height WRT x and y, which is exactly what these packages are doing. The real deal is a bit more complicated than this, but given that you can take the gradient through the noise function, it's probably not too hard.

What’s the Enzyme code that didn’t work and corresponding error message?

Reverse mode:

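```julia
using Enzyme

function sample_gradient_enzyme(sampler, x, y)
    arr = @SVector [x, y]
    f = a -> @inbounds sample(sampler, a[1], a[2])
    # Reverse-mode gradient with respect to the coordinates.
    return Enzyme.gradient(Reverse, f, arr)
end
```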

This prints out a wrong gradient, [0.0,0.0].

Forward mode:

```julia
function sample_gradient_enzyme(sampler, x, y)
    arr = @SVector [x, y]
    f = a -> @inbounds sample(sampler, a[1], a[2])
    # Forward-mode gradient with respect to the coordinates.
    return Enzyme.gradient(Forward, f, arr)
end
```

```
ERROR: MethodError: no method matching BatchDuplicated(::SVector{2, Float64}, ::Tuple{MVector{2, Float64}, MVector{2, Float64}})
Closest candidates are:
BatchDuplicated(::T, ::Tuple{Vararg{T, N}}) where {T, N} at C:\Users\User\.julia\packages\EnzymeCore\5yOUk\src\EnzymeCore.jl:85
```

Are you an Enzyme maintainer or something along that line?

Can you post a full log and the versions of the packages you're using?

FWIW the reverse mode Enzyme one works for me.

```julia
julia> println(sample_gradient_enzyme(sampler,0.87,0.88))
[0.07474945410688591, 0.017113162351589075]
```

I can reproduce the SVector / MVector forward mode mismatch though.

> Are you an Enzyme maintainer or something along that line?

I dabble from time to time.

I still use Julia version 1.8.4, if it matters. (I don't really use the new features yet, but I still use LoopVectorization, so I stuck with the version where it still worked.)
I used Enzyme version 0.10.18 (that is the version it installed for me).
Reverse mode ran without any error but produced a [0.0, 0.0] gradient. CoherentNoise is version 1.6.6.

I updated StaticArrays to version 1.9.3, but the issue persisted.

Update: After updating Enzyme to 0.12.5 (and somehow also updating most of my other packages in the process), it works.

The forward-mode SArray gradient error in Enzyme should be resolved by this PR: “Fix static arrays on forward mode gradient call” (EnzymeAD/Enzyme.jl, Pull Request #1438, by wsmoses).

Nice! I never thought I'd get to contribute a test case to Enzyme. I was just playing with gradients because the video said that you need the gradient of a noise function, and I decided to experiment with it a bit.

The video is long so I didn't watch it, but if @jling is right then your problem is deeper than a specific autodiff package. Differentiating a function with stochastic output is not what these packages are made for, and it could reasonably be classified as undefined behavior. Autodiff of stochastic computational graphs is its own research field and has its own libraries (like `storchastic`), but I suspect that is not what you need here, so your function might have been ill-specified?

This function is quite complicated. The function involved, the noise, has some stochasticity in its computation. However, it is cleverly designed to be differentiable everywhere. The first-order gradient of the function is very well-defined.

Let me elaborate further, then. Noise functions typically work on a grid. The stochastic part is simply in determining which cell of the grid you're in. Once you've determined the cell, the value of the noise at your position is determined by the cell's corner points. The position of the point within the cell is differentiable WRT the global position everywhere inside that cell. The function is cleverly designed so that it is differentiable WRT the position within the cell, and so that the influence of each corner falls continuously to zero as the point moves away from that corner, reaching zero exactly at the boundary beyond which the corner could no longer influence the value (because the point would have moved to a different cell). This ensures that the function is continuous and differentiable everywhere. I'd assume that means that if an autodiff package can differentiate a piecewise differentiable function, it can differentiate this. Some noise variants are designed so that even the gradient of the function is continuous everywhere.
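
The construction can be sketched in one dimension. This is a simplified Perlin-style gradient noise written purely for illustration. It is not the OpenSimplex2 algorithm and not CoherentNoise's implementation, and `lattice_gradient`, `fade`, and `noise1d` are hypothetical names.

```julia
# Deterministic pseudo-random slope in [-1, 1) for lattice point `i`.
# Uses Julia's built-in `hash`; any fixed integer hash would do (assumption).
function lattice_gradient(i::Integer, seed::Integer)
    h = hash((i, seed))
    return 2 * (h % 1024) / 1024 - 1
end

# Quintic fade curve: goes from 0 to 1 with zero first and second derivative at both ends.
fade(t) = t^3 * (t * (6 * t - 15) + 10)

# 1D gradient noise: each neighbouring lattice point contributes a linear ramp,
# and the fade curve blends the two contributions.
function noise1d(x::Real, seed::Integer=1)
    i = floor(Int, x)          # which cell we are in (the only discrete step)
    t = x - i                  # position within the cell, in [0, 1)
    left = lattice_gradient(i, seed) * t
    right = lattice_gradient(i + 1, seed) * (t - 1)
    return left + fade(t) * (right - left)
end

# Same seed and coordinate always give the same value.
println(noise1d(2.3) == noise1d(2.3))

# The value is continuous across a cell boundary: both sides approach 0 at x = 3.
println(abs(noise1d(3 - 1e-9) - noise1d(3 + 1e-9)) < 1e-8)
```

Because `fade` has zero slope at both ends of the cell, the derivative on either side of a lattice point equals that point's slope, so in this sketch the first derivative is continuous across cells as well.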

Oops… I was perhaps confusing stochastic with discrete. The noise process is entirely deterministic.

Yeah, I didn't continue since I think that's an unrelated tangent. The video is NOT about what StochasticAD.jl (gaurav-arya/StochasticAD.jl on GitHub: a research package for automatic differentiation of programs containing discrete randomness) deals with.

Help me with how to describe it, because every way I can think of to describe it involves already knowing the difference… but I was trying to make the point that the gradient propagation doesn't have to pass through `sample()`, i.e. the reparametrization trick (see “How does the reparameterization trick for VAEs work and why is it important?” on Cross Validated) works in the trivial sense that the sampling is never part of the gradient flow path? It's also possible I'm just mixing things up.

The program output is pseudo-random WRT the seed, but is continuous and differentiable WRT the coordinates.

So in a way you sample a random function, and then you fix that function and differentiate it with respect to its input? I feel like this might be represented better by having the noise drawn first (a discrete, grid-like object probably), and then the function constructed on top by interpolation.
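
Drawing a discrete grid of random values first and then building a smooth function on top by interpolation is usually called value noise. A minimal sketch under that assumption follows; `make_value_noise` and `smoothstep` are hypothetical names, and this is not how CoherentNoise's OpenSimplex2 sampler works internally.

```julia
using Random

# Cubic smoothstep: 0 at t = 0, 1 at t = 1, zero slope at both ends.
smoothstep(t) = t^2 * (3 - 2 * t)

# Draw the randomness once (an n×n grid of values), then return a deterministic
# function that interpolates the grid. Valid for 0 <= x, y < n - 1.
function make_value_noise(seed::Integer, n::Integer)
    grid = rand(MersenneTwister(seed), n, n)
    return function (x, y)
        i, j = floor(Int, x), floor(Int, y)
        u, v = smoothstep(x - i), smoothstep(y - j)
        # The grid is stored 1-based, so cell (i, j) has its corner at grid[i + 1, j + 1].
        bottom = (1 - u) * grid[i + 1, j + 1] + u * grid[i + 2, j + 1]
        top = (1 - u) * grid[i + 1, j + 2] + u * grid[i + 2, j + 2]
        return (1 - v) * bottom + v * top
    end
end

noise = make_value_noise(1, 8)

# Once the grid is fixed, the function is deterministic in (x, y).
println(noise(2.5, 3.25) == noise(2.5, 3.25))

# Two functions built from the same seed agree.
println(noise(2.5, 3.25) == make_value_noise(1, 8)(2.5, 3.25))
```

All randomness lives in the seeded `grid`; the returned closure is an ordinary deterministic function of `(x, y)`, which is the part an autodiff package would differentiate.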