ForwardDiff.jl returns NaNs for calculating gradient of simulation model


this is a follow up post to Differential evolution MCMC in Julia?

I want to try HMC/Nuts (via DynamicHMC.jl or AdvancedHMC.jl) because the algorithms may be more efficient than Differential Evolution MCMC.

I tried to use ForwardDiff.jl to calculate the gradient of my model (change in parameter values → change in log density). I have 19 parameters at the moment. For a subset of my model this is working (I excluded one process). However, for the full my model I get NaNs for the gradient with ForwardDiff.jl. With FiniteDifferences.jl it is working but slow (12 seconds for one gradient evaluation).

I already turned on the NaN safe option of ForwardDiff.jl: set_preferences!(ForwardDiff, "nansafe_mode" => true). For me it is not clear how to debug the resulting NaNs from ForwardDiff.jl.

I checked: Diagnosing NaNs from ForwardDiff
and saw that ForwardDiff.partials(x) turned to NaN at some point for some variables:

  Wsc = W > WHC ? 1.0 : W > PWP ? (W - PWP) / (WHC - PWP) : 0.0
  @. Wp = biomass_density_factor /
          (1 + exp(β_pet * u"d/mm" * (PET - α_pet)) / (1/(1-Wsc) - 1))
  @. W_sla = 1 - δ_sla + δ_sla / (1 + exp(-β_sla * (Wp - H_sla)))
  @. W_rsa = 1 - δ_wrsa + (K_wrsa + δ_wrsa - 1) / (1 + exp(-β_rsa * (Wp - H_rsa)))
  @. Waterred = W_sla * W_rsa

I think that the partials of the variable Wp was one of the first that turned to NaN.

I also noticed that the VSCode debugger is not working with ForwardDiff.jl (the execution should stop at a breakpoint when it is started with @run).

I use Unitful.jl (only within the model, the parameters and the logdensity do not have units) and DimensionalData.jl in my model. I guess that this is not a problem?

Here is the full program (I wasn’t able to reproduce the NaNs with a smaller program):

import Pkg

Pkg.add(url = "")
Pkg.add(["ForwardDiff", "TransformedLogDensities", "LogDensityProblems", "TransformVariables", "FiniteDifferences", "LogDensityProblemsAD"])

import GrasslandTraitSim as sim
import GrasslandTraitSim.Valid as valid
import GrasslandTraitSim.Vis as vis

using ForwardDiff
using TransformedLogDensities, LogDensityProblems
using TransformVariables
using FiniteDifferences
using LogDensityProblemsAD

trait_input = vis.load_trait_data(valid)
input_objs = valid.validation_input_plots(;
    plotIDs = ["HEG01"],
    nspecies = length(trait_input.sla));
valid_data = valid.get_validation_data_plots(;
    plotIDs = ["HEG01"]);
model_parameter = valid.model_parameters(; );
inf_p = (; zip(Symbol.(model_parameter.names),;

function my_loglik(inf_p)
    loglik = valid.loglikelihood_model(
        sim; input_objs, valid_data, plotID = "HEG01", inf_p,

    return loglik

struct V

function (problem::V)(θ)
    lprior = valid.prior_logpdf(model_parameter, collect(θ))

    if isinf(lprior)
        return -Inf

    return lprior + my_loglik(θ)

problem = V()
ℓ = TransformedLogDensity(model_parameter.t, problem)
∇ℓ = ADgradient(:FiniteDifferences, ℓ)
∇l = ADgradient(:ForwardDiff, ℓ)

x_test = TransformVariables.inverse(model_parameter.t, inf_p)
@time LogDensityProblems.logdensity(ℓ, x_test)
@time LogDensityProblems.logdensity_and_gradient(∇ℓ, x_test)
@time LogDensityProblems.logdensity_and_gradient(∇l, x_test)

Sorry, for this not so trivial problem! I am not an expert in automatic differentiation.

Are you parameters “constrained”, e.g. on the domain (0, Inf)? Because when working with samplers using gradient information, you need to make sure you’re working in “unconstrained” space so that the sampler don’t accidentally step outside of the valid parameter space.

The parameters are constrained. I used TransformVariables.jl and TransformedLogDensities.jl to transform them to (-Inf, Inf). So I think that this isn’t the problem.

I simplified this lines:

Wp = biomass_density_factor /
          (1 + exp(β_pet * u"d/mm" * (PET - α_pet)) / (1/(1-Wsc) - 1))


min(biomass_density_factor * Wsc * exp(-β_pet *  u"d/mm" * (PET - α_pet), 3.0)

and now it is working! I think that I wrote a too complex equation for a simple problem. Thanks @torfjelde!