Noise with gradient works!

The function looks quite random, but at the start of world generation you fix it by selecting a seed, and from then on it is a deterministic function. You can then draw a grid or any shape you want by evaluating the function at as many points as you need. The function was built to be smooth and differentiable, so there is no need to generate a grid and then interpolate to make it differentiable. Perlin noise and the many noises that came after it are used in various ways in world generation. The main point was that one technique presented in the video required the gradient of the noise, so I was testing whether autograd would work, and it did, beautifully.

And what I’m saying is that, based on the code you provided, you should not expect it to work in general, because differentiating through noise is undefined behavior in the majority of cases.

These are not randomly concocted noise functions. Perlin noise and its later versions were designed to be differentiable (as a way to guarantee smoothness). It’s like this…

I don’t know exactly how autograd works but if you have something like…

f(x) = x+random(seed=floor(x))

Then my idea is that it should be differentiable because d/dx floor(x) = 0 wherever x is not an integer… therefore, even if the gradient of the random part is undefined, it gets multiplied by zero anyway, leaving you with a well-defined gradient.

But perhaps I’m wrong. I’m not used to automatic differentiation. I just have the idea that these kinds of functions should in principle be differentiable
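
A minimal sketch of this argument, using only Julia's standard library. `cell_value` is a hypothetical stand-in for `random(seed=floor(x))`, and a central finite difference stands in for automatic differentiation, so this illustrates the math rather than how any particular AD package treats `floor`.

```julia
# Hypothetical stand-in for random(seed=k): one fixed pseudo-random
# value in [0, 1] per integer k (an assumption, not a library API).
cell_value(k::Integer) = hash(k) / typemax(UInt)

# The function from the post: x plus a "random" offset that only
# depends on which integer cell x falls in.
f(x) = x + cell_value(floor(Int, x))

# Central finite difference, used here instead of an AD package.
central_diff(g, x; h=1e-6) = (g(x + h) - g(x - h)) / (2h)

# Inside a cell, both evaluation points share the same floor(x),
# so the offset cancels and only the slope of x remains.
slope_inside = central_diff(f, 0.5)

# Across an integer boundary the offset changes, so f jumps there.
jump = f(1.0) - f(prevfloat(1.0))

println("slope at 0.5: ", slope_inside)
println("jump at 1.0:  ", jump)
```

Away from integers the slope comes out as approximately 1. At an integer the offset changes and the function jumps, so no derivative exists at those isolated points.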

That may be the case, but as far as I can tell from the repo, CoherentNoise.jl doesn’t include any specific differentiation rules for either ChainRulesCore.jl, ForwardDiff.jl or Enzyme.jl. Nor does its README say anything about differentiability.
So my point remains: if you differentiate through a function that calls a random sampler, and you don’t use a tool which is built for that purpose (like StochasticAD.jl), you expose yourself to incorrect behavior.

Again though, these coherent noise samplers are very different from regular samplers, and they don’t seem to need much derivative specialization. You can see visually here that ForwardDiff computes faithful gradients:

using GLMakie, CoherentNoise, ForwardDiff, StaticArrays

let sampler = opensimplex_2d(seed=1)
    xs = ys = range(-2, 1, length=100)
    # Noise value at every grid point
    zs = map(Iterators.product(xs, ys)) do (x, y)
        sample(sampler, x, y)
    end
    step = 5
    # Gradient of the noise on a coarser grid (every `step`-th point)
    grads = map(Iterators.product(xs[1:step:end], ys[1:step:end])) do (x, y)
        ForwardDiff.gradient(v -> sample(sampler, v[1], v[2]), SVector((x, y)))
    end
    p = heatmap(xs, ys, zs)
    # Overlay the gradients as arrows, scaled down by 20 for readability
    arrows!(xs[1:step:end], ys[1:step:end], map(x -> x[1]/20, grads), map(x -> x[2]/20, grads))
    p
end


This works even though I constructed the gradients after already doing a pass over the points because these samplers are perfectly correlated when re-evaluated at the same point, so they’re safe to use with ForwardDiff:

julia> let sampler = opensimplex_2d(seed=2)
           map(1:10) do _
               sample(sampler, 1, 1)
           end
       end
10-element Vector{Float64}:

StochasticAD.jl is not necessary here, because the program being differentiated is deterministic. There’s basically just a random input parameter to the program, but that parameter is constant throughout the execution and from run to run.


Uhh… I think I understood your point now.

There is indeed a concern that… for example… if you sample a point that is 1 with probability x… and call that f(x), implemented with rand_between_0_to_1()<x,
then E[f(x)] = x and d/dx E[f(x)] = 1, but f’(x) = 0 everywhere it is defined, because an infinitesimal change in x won’t change whether or not the sampled value falls below x.

The derivative of an expectation is indeed not always equal to the expectation of the derivative.
However, that’s not what it is here. This is akin to having a seed and making the process deterministic by fixing the seed.
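
The Bernoulli example above can be checked numerically. This is a rough sketch using only Julia's `Random` standard library; the name `bernoulli`, the seed, the sample count and the step size `h` are illustrative assumptions. It estimates E[f(x)] by Monte Carlo, then takes a pathwise finite difference by replaying the same random draws at `x` and `x + h`.

```julia
using Random

# 1 with probability x, 0 otherwise, as described in the post.
bernoulli(x, rng) = rand(rng) < x ? 1.0 : 0.0

x, h, n = 0.3, 1e-12, 10_000

# Monte Carlo estimate of E[f(x)]; it should land close to x.
rng = MersenneTwister(1)
mean_estimate = sum(bernoulli(x, rng) for _ in 1:n) / n

# Pathwise finite difference: two generators with the same seed
# produce the same draws, so each sample is compared with itself
# at x and at x + h.
rng_a, rng_b = MersenneTwister(1), MersenneTwister(1)
pathwise = sum((bernoulli(x + h, rng_b) - bernoulli(x, rng_a)) / h for _ in 1:n) / n

println("estimate of E[f(x)]:      ", mean_estimate)
println("mean pathwise difference: ", pathwise)
```

The estimate of E[f(x)] tracks x, whose derivative is 1, while the pathwise difference is zero unless a draw happens to land in the tiny interval between `x` and `x + h`. That gap is the failure mode tools like StochasticAD.jl are built for. A noise function with a fixed seed has no such sampling step inside it.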

Okay, that was the missing ingredient for me, sorry about the misunderstanding. In this case I think you’re fine indeed.