Changing the binding of a variable in a closure

I am labeling this question as “statistics” because the motivation comes from a statistical application. The tl: dr version is that I want to change the binding of a variable in a closure within a loop. The function itself is defined outside the loop. I don’t have a lot of flexibility in defining the function because I am passing it to derivative evaluation functions in the ForwardDiff package and those must be unary functions. Right now my approach is to declare the variable’s name as global before I define the function. I’m wondering if there is a more aesthetically pleasing approach.

So the background is that a nonlinear regression model function depends upon parameters and observed data. Generally the parameters are a relatively short vector and the data are columns in a table, like a DataFrame. One of the simple (but real) examples in Bates and Watts (Wiley, 1988), Nonlinear Regression Analysis and Its Applications is data from an enzyme kinetics experiment

julia> data = DataFrame(conc = repeat([0.02,0.06,0.11,0.22,0.56,1.1], inner=2),
           rate = float([76,47,97,107,123,139,159,152,191,201,207,200]))
12×2 DataFrame
│ Row │ conc    │ rate    │
│     │ Float64 │ Float64 │
│ 1   │ 0.02    │ 76.0    │
│ 2   │ 0.02    │ 47.0    │
│ 3   │ 0.06    │ 97.0    │
│ 4   │ 0.06    │ 107.0   │
│ 5   │ 0.11    │ 123.0   │
│ 6   │ 0.11    │ 139.0   │
│ 7   │ 0.22    │ 159.0   │
│ 8   │ 0.22    │ 152.0   │
│ 9   │ 0.56    │ 191.0   │
│ 10  │ 0.56    │ 201.0   │
│ 11  │ 1.1     │ 207.0   │
│ 12  │ 1.1     │ 200.0   │

for which the Michaelis-Menten model for reaction rate as a function of substrate concentration would be appropriate.

julia> function MicMen(pars, data)
           Vm = pars[1]
           K = pars[2]
           conc = data.conc
           @. Vm * conc / (K + conc)
MicMen (generic function with 1 method)

For these data, values Vm = 200. and K = 0.05 give the predicted response

julia> show(MicMen([200., 0.05], data))
[57.14285714285714, 57.14285714285714, 109.0909090909091, 109.0909090909091, 137.5, 137.5, 162.96296296296296, 162.96296296296296, 183.60655737704917, 183.60655737704917, 191.30434782608697, 191.30434782608697]

I want to use methods from ForwardDiff to evaluate the Jacobian matrix but the function argument to ForwardDiff.jacobian! must be a function of pars only. I can use a closure and generate

f(x) = MicMen(x, data)

in a context where data is defined so that is fine.

However, I also want to do this for several related groups of responses, say saved in a GroupedDataFrame, where the value of pars will change between groups. This is for fitting nonlinear mixed-effects models, described in Pinheiro and Bates (Springer, 2000), Mixed-Effects Models in S and S-PLUS. As I loop over the groups I need the binding of data in the closure to change to each SubDataFrame. I am currently doing this by assigning a global variable _sdf before defining f and changing the global variable’s binding in the loop.

Are there better approaches? I feel that I should be able to do something with let but I can’t seem to work out how to do so.

Put the variable in some sort of container (like a Ref) and update that instead?


When in doubt, forget that closures exist and do it by hand. E.g.

julia> struct F{T}

julia> function (f::F)(par)
       conc = f.conc
       @. Vm * conc / (K + conc)

Now, each instance of F, like e.g. F(data.conc), is a valid input to forward diff. This is how closures are internally implemented.

Changing bindings in closures is supported, but almost always a bad idea. Instead, construct a new closure.

edit: Or do what @kristoffer.carlsson said, depending on what you actually want (I don’t know whether I understood what your issue is).


In case someone reading this later needs more detail, I created the Ref from the first SubDataFrame, then define the function for ForwardDiff.jacobian! dereferencing the Ref. If data is the GroupedDataFrame it looks like

Rdf = Ref(first(data))
f(pars) = MicMen(pars, Rdf[])

Then the loop looks like

for (res, sdf) in zip(results, data)
    Rdf[] = sdf
    ... manipulate the pars to get the appropriate value
    ForwardDiff.jacobian!(res, f, pars, cfg)

where cfg is a (common) JacobianConfig and results is a preallocated vector of the form

results = [DiffResults.JacobianResult(sdf.age, pars) for sdf in data]

Is it the same pars for all subdataframes? It is unclear what the role of pars is, can you provide an MWE for the complete calculation?

The values of pars change from group to group. The groups of observations correspond to experimental units of some sort in the experiment. In a pharmacokinetics study there will be several participants each of whom receives dosages of a drug following which several measurements of serum concentrations over time are made.

tl; dr As an example, in R, datasets::Theoph is data from such a study. A compartment model for concentration as a function of time depends upon three scalar parameters for each subject: ka, the absorption rate constant, k, the elimination rate constant and V, the effective volume of distribution, as

(dose/V) * (ka /(ka - k)) * (exp(-k*t) - exp(-ka*t))

Typically these parameters, all of which must be positive, are estimated on the logarithmic scale so the model function is

function sdOral1C(φ, d)
    k  = exp(φ[1])    # elimination rate constant from lk
    ka = exp(φ[2])    # absorption rate constant from lka
    V  = exp(φ[3])    # volume of distribution from lV
    t  = d.Time
    @. (d.Dose/V) * (ka/(ka - k)) * (exp(-k*t) - exp(-ka*t))

If we pool all the data we get

m = fit!(NLregModel(sdOral1C, Theoph, (lk = -2.5, lka = 0.5, lV = -1.0), :Conc))
Nonlinear regression model (NLregModel)

 Residual sum of squares: 274.4491345682938

      Estimate  Std.Error  t value
lk   -2.52424   0.110347    -22.88
lka   0.399228  0.117537      3.40
lV   -0.724023  0.0485799   -14.90

If we fit each subject’s data separately we get

julia> fits = [fit!(NLregModel(sdOral1C, sdf, params(m), :Conc)) for sdf in groupby(Theoph, :Subject)]
12-element Array{NLregModel{3,Float64},1}:
 Nonlinear regression model (NLregModel)

 Residual sum of squares: 2.44424021741903

      Estimate  Std.Error  t value
lk   -2.30733   0.196063    -11.77
lka   0.151628  0.213966      0.71
lV   -0.665909  0.0980873    -6.79
 Nonlinear regression model (NLregModel)

 Residual sum of squares: 0.9965571862947799
 Nonlinear regression model (NLregModel)

 Residual sum of squares: 13.463469674674933

      Estimate  Std.Error  t value
lk   -2.42548    0.277133    -8.75
lka   0.386278   0.296296     1.30
lV   -0.707117   0.127841    -5.53

A mixed-effects model for these data is intermediate to the pooled model and the individual model. Each subject’s parameters are a sum of population values (called “fixed effects”) and subject-specific deviations (or “random effects”) which are assumed to be samples from a multivariate distribution (typically multivariate Gaussian). The purpose is to estimate the fixed effects and the variance-covariance parameters in the distribution of the random effects. This distribution regularizes the optimization of the individual’s parameter values. Instead of a nonlinear least squares fit, the optimization of each individual’s parameters becomes a penalized nonlinear least squares fit where the penalty depends on those variance-covariance parameters.

NLregModel is defined in which is also where I am working on an implementation of a nonlinear mixed-effects model.