# Changing the binding of a variable in a closure

I am labeling this question as “statistics” because the motivation comes from a statistical application. The tl;dr version is that I want to change the binding of a variable in a closure within a loop. The function itself is defined outside the loop. I don’t have a lot of flexibility in defining the function because I am passing it to derivative evaluation functions in the `ForwardDiff` package and those must be unary functions. Right now my approach is to declare the variable’s name as global before I define the function. I’m wondering if there is a more aesthetically pleasing approach.

So the background is that a nonlinear regression model function depends upon parameters and observed data. Generally the parameters are a relatively short vector and the data are columns in a table, like a `DataFrame`. One of the simple (but real) examples in Bates and Watts, *Nonlinear Regression Analysis and Its Applications* (Wiley, 1988), is data from an enzyme kinetics experiment:

``````julia
julia> data = DataFrame(conc = repeat([0.02,0.06,0.11,0.22,0.56,1.1], inner=2),
                        rate = float([76,47,97,107,123,139,159,152,191,201,207,200]))
12×2 DataFrame
│ Row │ conc    │ rate    │
│     │ Float64 │ Float64 │
├─────┼─────────┼─────────┤
│ 1   │ 0.02    │ 76.0    │
│ 2   │ 0.02    │ 47.0    │
│ 3   │ 0.06    │ 97.0    │
│ 4   │ 0.06    │ 107.0   │
│ 5   │ 0.11    │ 123.0   │
│ 6   │ 0.11    │ 139.0   │
│ 7   │ 0.22    │ 159.0   │
│ 8   │ 0.22    │ 152.0   │
│ 9   │ 0.56    │ 191.0   │
│ 10  │ 0.56    │ 201.0   │
│ 11  │ 1.1     │ 207.0   │
│ 12  │ 1.1     │ 200.0   │
``````

for which the Michaelis-Menten model for reaction rate as a function of substrate concentration would be appropriate.

``````julia
julia> function MicMen(pars, data)
           Vm = pars[1]
           K = pars[2]
           conc = data.conc
           @. Vm * conc / (K + conc)
       end
MicMen (generic function with 1 method)
``````

For these data, values `Vm = 200.` and `K = 0.05` give the predicted response

``````julia
julia> show(MicMen([200., 0.05], data))
[57.14285714285714, 57.14285714285714, 109.0909090909091, 109.0909090909091, 137.5, 137.5, 162.96296296296296, 162.96296296296296, 183.60655737704917, 183.60655737704917, 191.30434782608697, 191.30434782608697]
``````

I want to use methods from `ForwardDiff` to evaluate the Jacobian matrix but the function argument to `ForwardDiff.jacobian!` must be a function of `pars` only. I can use a closure and generate

``````julia
f(x) = MicMen(x, data)
``````

in a context where `data` is defined, so that is fine.
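For concreteness, such a closure can be passed straight to `ForwardDiff.jacobian`. A minimal sketch, assuming `ForwardDiff` is installed; a `NamedTuple` stands in for the `DataFrame` so no other packages are needed:

```julia
using ForwardDiff

# NamedTuple stand-in for the DataFrame above; `data.conc` works the same way
data = (conc = repeat([0.02, 0.06, 0.11, 0.22, 0.56, 1.1], inner = 2),)

function MicMen(pars, data)
    Vm, K = pars[1], pars[2]
    conc = data.conc
    @. Vm * conc / (K + conc)
end

f(x) = MicMen(x, data)                       # closure capturing `data`
J = ForwardDiff.jacobian(f, [200.0, 0.05])   # 12×2 Jacobian of the predictions
```

The first column of `J` is `conc / (K + conc)` and the second is `-Vm * conc / (K + conc)^2`, matching the analytic partial derivatives.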

However, I also want to do this for several related groups of responses, say saved in a `GroupedDataFrame`, where the value of `pars` will change between groups. This is for fitting nonlinear mixed-effects models, described in Pinheiro and Bates, *Mixed-Effects Models in S and S-PLUS* (Springer, 2000). As I loop over the groups I need the binding of `data` in the closure to change to each `SubDataFrame`. I am currently doing this by assigning a global variable `_sdf` before defining `f` and changing the global variable’s binding in the loop.

Are there better approaches? I feel that I should be able to do something with `let` but I can’t seem to work out how to do so.

Put the variable in some sort of container (like a `Ref`) and update that instead?


When in doubt, forget that closures exist and do it by hand. E.g.

``````julia
julia> struct F{T}
           conc::T
       end

julia> function (f::F)(pars)
           Vm = pars[1]
           K = pars[2]
           conc = f.conc
           @. Vm * conc / (K + conc)
       end
``````

Now each instance of `F`, e.g. `F(data.conc)`, is a valid input to `ForwardDiff`. This is essentially how closures are implemented internally.

Changing bindings in closures is supported, but almost always a bad idea. Instead, construct a new closure.
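For example, rather than mutating any binding, a fresh closure per group works fine. A package-free sketch, where a vector of `NamedTuple`s stands in for the `SubDataFrame`s of a `GroupedDataFrame`:

```julia
function MicMen(pars, data)
    Vm, K = pars[1], pars[2]
    conc = data.conc
    @. Vm * conc / (K + conc)
end

# stand-ins for the SubDataFrames in a GroupedDataFrame
groups = [(conc = [0.02, 0.06],), (conc = [0.11, 0.22],)]

preds = map(groups) do sdf
    f = pars -> MicMen(pars, sdf)   # a fresh closure for each group; nothing is rebound
    f([200.0, 0.05])
end
```

Each iteration gets its own `f`, so there is no shared mutable state to manage.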

edit: Or do what @kristoffer.carlsson said, depending on what you actually want (I don’t know whether I understood what your issue is).


In case someone reading this later needs more detail: I create the `Ref` from the first `SubDataFrame`, then define the function for `ForwardDiff.jacobian!` by dereferencing the `Ref`. If `data` is the `GroupedDataFrame` it looks like

``````julia
Rdf = Ref(first(data))
f(pars) = MicMen(pars, Rdf[])
``````

Then the loop looks like

``````julia
for (res, sdf) in zip(results, data)
    Rdf[] = sdf
    # ... manipulate the pars to get the appropriate value
    ForwardDiff.jacobian!(res, f, pars, cfg)
end
``````

where `cfg` is a (common) `JacobianConfig` and `results` is a preallocated vector of the form

``````julia
results = [DiffResults.JacobianResult(sdf.age, pars) for sdf in data]
``````
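A package-free miniature of the whole pattern, with hypothetical toy groups (`NamedTuple`s standing in for the `SubDataFrame`s):

```julia
function MicMen(pars, data)
    Vm, K = pars[1], pars[2]
    conc = data.conc
    @. Vm * conc / (K + conc)
end

groups = [(conc = [0.02, 0.06],), (conc = [0.11, 0.22],)]

Rdf = Ref(first(groups))        # mutable container holding the current group
f(pars) = MicMen(pars, Rdf[])   # defined once; dereferences the Ref on every call

results = map(groups) do sdf
    Rdf[] = sdf                 # update the Ref instead of rebinding a global
    f([200.0, 0.05])
end
```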

Is it the same `pars` for all subdataframes? It is unclear what the role of `pars` is; can you provide an MWE for the complete calculation?

The values of `pars` change from group to group. The `groups` of observations correspond to experimental units of some sort in the experiment. In a pharmacokinetics study there will be several participants each of whom receives dosages of a drug following which several measurements of serum concentrations over time are made.

tl;dr As an example, in R, `datasets::Theoph` is data from such a study. A compartment model for concentration as a function of time depends upon three scalar parameters for each subject: `ka`, the absorption rate constant; `k`, the elimination rate constant; and `V`, the effective volume of distribution, as

``````julia
(dose/V) * (ka /(ka - k)) * (exp(-k*t) - exp(-ka*t))
``````

Typically these parameters, all of which must be positive, are estimated on the logarithmic scale so the model function is

``````julia
function sdOral1C(φ, d)
    k  = exp(φ[1])    # elimination rate constant from lk
    ka = exp(φ[2])    # absorption rate constant from lka
    V  = exp(φ[3])    # volume of distribution from lV
    t  = d.Time
    @. (d.Dose/V) * (ka/(ka - k)) * (exp(-k*t) - exp(-ka*t))
end
``````
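As a quick sanity check, the function can be evaluated on made-up `Dose`/`Time` values (the definition is repeated with explicit indexing so the snippet runs on its own; the numbers are hypothetical):

```julia
function sdOral1C(φ, d)
    k  = exp(φ[1])   # elimination rate constant from lk
    ka = exp(φ[2])   # absorption rate constant from lka
    V  = exp(φ[3])   # volume of distribution from lV
    t  = d.Time
    @. (d.Dose / V) * (ka / (ka - k)) * (exp(-k * t) - exp(-ka * t))
end

d = (Time = [0.25, 1.0, 4.0, 12.0], Dose = fill(4.0, 4))  # hypothetical values
conc = sdOral1C([-2.5, 0.5, -1.0], d)   # concentrations rise to a peak, then decay
```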

If we pool all the data we get

``````julia
m = fit!(NLregModel(sdOral1C, Theoph, (lk = -2.5, lka = 0.5, lV = -1.0), :Conc))
Nonlinear regression model (NLregModel)

Residual sum of squares: 274.4491345682938

──────────────────────────────────
     Estimate  Std.Error   t value
──────────────────────────────────
lk   -2.52424   0.110347    -22.88
lka   0.399228  0.117537      3.40
lV   -0.724023  0.0485799   -14.90
──────────────────────────────────
``````

If we fit each subject’s data separately we get

``````julia
julia> fits = [fit!(NLregModel(sdOral1C, sdf, params(m), :Conc)) for sdf in groupby(Theoph, :Subject)]
12-element Array{NLregModel{3,Float64},1}:
Nonlinear regression model (NLregModel)

Residual sum of squares: 2.44424021741903

──────────────────────────────────
     Estimate  Std.Error   t value
──────────────────────────────────
lk   -2.30733   0.196063    -11.77
lka   0.151628  0.213966      0.71
lV   -0.665909  0.0980873    -6.79
──────────────────────────────────

Nonlinear regression model (NLregModel)

Residual sum of squares: 0.9965571862947799
...
Nonlinear regression model (NLregModel)

Residual sum of squares: 13.463469674674933

──────────────────────────────────
     Estimate  Std.Error   t value
──────────────────────────────────
lk   -2.42548    0.277133    -8.75
lka   0.386278   0.296296     1.30
lV   -0.707117   0.127841    -5.53
──────────────────────────────────
``````

A mixed-effects model for these data is intermediate to the pooled model and the individual model. Each subject’s parameters are a sum of population values (called “fixed effects”) and subject-specific deviations (or “random effects”) which are assumed to be samples from a multivariate distribution (typically multivariate Gaussian). The purpose is to estimate the fixed effects and the variance-covariance parameters in the distribution of the random effects. This distribution regularizes the optimization of the individual’s parameter values. Instead of a nonlinear least squares fit, the optimization of each individual’s parameters becomes a penalized nonlinear least squares fit where the penalty depends on those variance-covariance parameters.
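In symbols, if $\beta$ denotes the fixed effects, $b_i \sim \mathcal{N}(0, \Sigma)$ the random effects for group $i$, and $f$ the model function, each group's inner optimization is roughly (a sketch, glossing over scaling by the residual variance):

$$
\min_{b_i} \; \left\| y_i - f(\beta + b_i,\ \text{data}_i) \right\|^2 + b_i^\top \Sigma^{-1} b_i
$$

The quadratic term $b_i^\top \Sigma^{-1} b_i$ is the penalty that shrinks each subject's parameters toward the population values.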

`NLregModel` is defined in https://github.com/dmbates/NLreg.jl which is also where I am working on an implementation of a nonlinear mixed-effects model.
