Noise with gradient works!

The function is pseudo-random. At the start of world generation you fix it by selecting a seed, which turns it into a deterministic function. You can then draw a grid or any shape you want by evaluating the function as many times as needed. The function was built to be smooth and differentiable, so there's no need to do things like generating a grid and then interpolating to make it differentiable. Perlin noise and the many noises that came after it are used in various ways in world generation. The main point is that one technique presented in the video required the gradient of the noise, so I was testing whether autograd would work, and it did, beautifully.

And what I’m saying is that, based on the code you provided, you should not expect it to work in general, because differentiating through noise is undefined behavior in the majority of cases.

These are not randomly concocted noises. Perlin noise and its later versions were designed to be differentiable (as a way to guarantee smoothness). It’s like this…

I don’t know exactly how autograd works, but if you have something like…

f(x) = x + random(seed=floor(x))

Then my idea is that it should be differentiable, because d/dx floor(x) = 0… therefore, even if the gradient of the random part is undefined, it all gets multiplied by zero anyway, leaving you with a perfectly well-defined gradient.
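
For example, here is a toy sketch of that argument with ForwardDiff, using sin(12.9898 * floor(x)) as a made-up deterministic stand-in for random(seed=floor(x)); assuming ForwardDiff treats floor as piecewise constant (zero derivative), the derivative should come out as exactly 1:

using ForwardDiff

# Made-up stand-in for random(seed=floor(x)): deterministic, and it only
# changes value when x crosses an integer, so it is piecewise constant in x.
cellnoise(x) = sin(12.9898 * floor(x))

f(x) = x + cellnoise(x)

# Away from the integer breakpoints, d/dx floor(x) = 0, so the cellnoise term
# contributes nothing to the derivative.
ForwardDiff.derivative(f, 2.5)   # expected: 1.0
ForwardDiff.derivative(f, -0.3)  # expected: 1.0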

But perhaps I’m wrong. I’m not used to automatic differentiation; I just have the idea that these kinds of functions should in principle be differentiable.

That may be the case, but as far as I can tell from the repo, CoherentNoise.jl doesn’t include any specific differentiation rules for ChainRulesCore.jl, ForwardDiff.jl, or Enzyme.jl, nor does its README say anything about differentiability.
So my point remains: if you differentiate through a function that calls a random sampler and you don’t use a tool built for that purpose (like StochasticAD.jl), you expose yourself to incorrect behavior.
Also related:

Again though, these coherent noise samplers are very different from ordinary random samplers, and they don’t seem to need much derivative specialization. You can see visually here that it is computing faithful gradients:

using GLMakie, CoherentNoise, ForwardDiff, StaticArrays

let sampler = opensimplex_2d(seed=1)
    # Sample the noise on a dense grid
    xs = ys = range(-2, 1, length=100)
    zs = map(Iterators.product(xs, ys)) do (x, y)
        sample(sampler, x, y)
    end
    # Compute gradients with ForwardDiff on a coarser grid
    step = 5
    grads = map(Iterators.product(xs[1:step:end], ys[1:step:end])) do (x, y)
        ForwardDiff.gradient(v -> sample(sampler, v[1], v[2]), SVector((x, y)))
    end
    # Overlay the (scaled-down) gradient field on the heatmap
    p = heatmap(xs, ys, zs)
    arrows!(xs[1:step:end], ys[1:step:end], map(x -> x[1]/20, grads), map(x -> x[2]/20, grads))
    p
end

[image: heatmap of the sampled noise with the ForwardDiff gradient field overlaid as arrows]

This works even though I computed the gradients in a separate pass over the points, because these samplers return exactly the same value every time they are re-evaluated at the same point, so they’re safe to use with ForwardDiff:

julia> let sampler = opensimplex_2d(seed=2)
           map(1:10) do _
               sample(sampler, 1, 1)
           end
       end
10-element Vector{Float64}:
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994
 -0.04005349072514994

StochasticAD.jl is not necessary here, because the program being differentiated is deterministic. There’s basically just a random input parameter to the program, but that parameter is constant throughout the execution and from run to run.


Uhh… I think I understand your point now.

There is indeed a concern that, for example, if you sample a value that is 1 with probability x, and call it f(x), implemented as rand_between_0_and_1() < x, then E[f(x)] = x and d/dx E[f(x)] = 1, but f’(x) = 0 everywhere it is defined, because an infinitesimal change in x won’t change whether or not the sampled value falls below x.
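
A quick sketch illustrating that mismatch numerically (with plain rand() as the uniform sampler):

using ForwardDiff, Statistics

# 1 with probability x, 0 otherwise, written the naive way
bernoulli(x) = rand() < x ? one(x) : zero(x)

# Naive AD through the sampler: both branches are constant in x,
# so the derivative is always 0...
ForwardDiff.derivative(bernoulli, 0.3)   # 0.0

# ...even though d/dx E[bernoulli(x)] = 1, as a crude finite-difference
# estimate on the Monte Carlo mean shows:
h = 0.01
(mean(rand() < 0.3 + h for _ in 1:10^6) -
 mean(rand() < 0.3 - h for _ in 1:10^6)) / 2h   # ≈ 1.0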

The derivative of the expectation is indeed not always equal to the expectation of the derivative.
However, that’s not what’s happening here. This is akin to having a seed and making the whole process deterministic by fixing that seed.
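
For instance, reusing the sampler API from above, two samplers constructed with the same seed agree in both value and gradient (just a quick sanity-check sketch with an arbitrary seed and point):

using CoherentNoise, ForwardDiff, StaticArrays

s1 = opensimplex_2d(seed=42)
s2 = opensimplex_2d(seed=42)

# Same seed ⇒ same deterministic function, so values and gradients agree.
v = SVector(0.7, -1.3)
sample(s1, v[1], v[2]) == sample(s2, v[1], v[2])          # true
ForwardDiff.gradient(p -> sample(s1, p[1], p[2]), v) ==
    ForwardDiff.gradient(p -> sample(s2, p[1], p[2]), v)  # true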

Okay, that was the missing ingredient for me, sorry about the misunderstanding. In this case I think you’re fine indeed.
