How does ForwardDiff.jl and FiniteDiff.jl handle edge-hitting cases?

Hi guys,

I am using ForwardDiff.jl to calculate the parameter uncertainty in my model (see below about the more detailed information…). I am wondering, how does ForwardDiff.jl and FiniteDiff.jl handle the edge-hitting parameters? i.e. what if the optimization finds one of the parameter is at the lower/upper bounds? and these bounds are defined in the model codes…

============
detailed background:
I am optimizing the parameters of a non-linear computer simulation of a physical process M(\theta, X), by attempting to find the simulational parameters \theta that make my simulation reproduce a set of summary metrics as similarly as possible to my observed values y_0. To do this, I am using the following cost function:

L(\theta_{optim}) = \min_{\theta} L(y_o, y_{sim}(\theta)) =\sum_{GPP, RECO,NEE,ET}\frac{(y_o-y_{sim}(\theta))^2}{var(y_o)} + \sum_{LAI,AGB}\frac{|y_o-y_{sim}(\theta)|}{1+mean(y_o)} + \sum_{NDVI}\frac{((y_o-mean(y_o))-(y_{sim}-mean(y_{sim}(\theta))))^2}{var(y_o)}

where GPP, RECO, NEE, ET, LAI, AGB, NDVI are all different observation variables (time series). The idea is to jointly optimise across these different metrics, but weight their relative impact based on the measurement noise of the observed variables y_0.

I intend to calculate the J matrix, where J is the Jacobian of the values of my simulated time series with regards to the underlying model parameters \theta, and would have the shape of (n_time_series_point, n_parameter) (Is this correct?). In other words, the matrix J would be the sensitivity of each data point in my observed timeseries data to the simulation parameters, at the optimal point in parameter space \theta_{optim}.

I understand I can then use the matrix J^TJ to calculate the Hessian matrix, take the inverse of this Hessian, and use the square of the diagonal elements of the resulting matrix as a quantitative estimate of the uncertainty in my parameters given my data; as this has some precendent in the literature.

First, ForwardDiff is an analytical derivative (computed by “forward-mode” AD), and FiniteDiff is approximate numerical derivative computed by finite differences. These are very different things!

Second, you need to be precise about what you mean by “edge-hitting” parameters. Suppose you have a differentiable function f(x), and you want to compute (or approximate) f'(x). By “edge-hitting”, I imagine you could mean one of two things.

  • (a) You are optimizing f(x) in some domain, say x \in [a,b], but the function is still defined outside that domain.
    • Then, both packages should work fine in computing f'(b) at the “edge”: ForwardDiff will compute the analytical derivative, using only computations at b, whereas FiniteDiff will look at points like b \pm \Delta x, so it may evaluate f slightly outside the domain… but so what?
  • (b) Your function f(x) is not defined outside the domain. In that case, the derivative f'(b) at the “edge” x=b is not even defined, since the conventional definition of the derivative requires the two-sided limit to exist. In this case:
    • ForwardDiff will probably return an answer corresponding to a one-sided limit (if it exists!), though it may pick the “wrong” side" — when there is a discontinuity in a function, it typically picks one side or the other depending on how the if statements are written.
    • FiniteDiff will default to a two-sided limit (via a centered difference) and fail or return nonsense, but you can pass ForwardMode or BackwardMode, which approximate one-sided limits from the right or left, respectively, assuming that’s the answer you want.