Changing the binding of a variable in a closure

dmbates · July 12, 2019, 6:54pm

I am labeling this question as “statistics” because the motivation comes from a statistical application. The tl: dr version is that I want to change the binding of a variable in a closure within a loop. The function itself is defined outside the loop. I don’t have a lot of flexibility in defining the function because I am passing it to derivative evaluation functions in the ForwardDiff package and those must be unary functions. Right now my approach is to declare the variable’s name as global before I define the function. I’m wondering if there is a more aesthetically pleasing approach.

So the background is that a nonlinear regression model function depends upon parameters and observed data. Generally the parameters are a relatively short vector and the data are columns in a table, like a DataFrame. One of the simple (but real) examples in Bates and Watts (Wiley, 1988), Nonlinear Regression Analysis and Its Applications is data from an enzyme kinetics experiment

julia> data = DataFrame(conc = repeat([0.02,0.06,0.11,0.22,0.56,1.1], inner=2),
           rate = float([76,47,97,107,123,139,159,152,191,201,207,200]))
12×2 DataFrame
│ Row │ conc    │ rate    │
│     │ Float64 │ Float64 │
├─────┼─────────┼─────────┤
│ 1   │ 0.02    │ 76.0    │
│ 2   │ 0.02    │ 47.0    │
│ 3   │ 0.06    │ 97.0    │
│ 4   │ 0.06    │ 107.0   │
│ 5   │ 0.11    │ 123.0   │
│ 6   │ 0.11    │ 139.0   │
│ 7   │ 0.22    │ 159.0   │
│ 8   │ 0.22    │ 152.0   │
│ 9   │ 0.56    │ 191.0   │
│ 10  │ 0.56    │ 201.0   │
│ 11  │ 1.1     │ 207.0   │
│ 12  │ 1.1     │ 200.0   │

for which the Michaelis-Menten model for reaction rate as a function of substrate concentration would be appropriate.

julia> function MicMen(pars, data)
           Vm = pars[1]
           K = pars[2]
           conc = data.conc
           @. Vm * conc / (K + conc)
       end
MicMen (generic function with 1 method)

For these data, values Vm = 200. and K = 0.05 give the predicted response

julia> show(MicMen([200., 0.05], data))
[57.14285714285714, 57.14285714285714, 109.0909090909091, 109.0909090909091, 137.5, 137.5, 162.96296296296296, 162.96296296296296, 183.60655737704917, 183.60655737704917, 191.30434782608697, 191.30434782608697]

I want to use methods from ForwardDiff to evaluate the Jacobian matrix but the function argument to ForwardDiff.jacobian! must be a function of pars only. I can use a closure and generate

f(x) = MicMen(x, data)

in a context where data is defined so that is fine.

However, I also want to do this for several related groups of responses, say saved in a GroupedDataFrame, where the value of pars will change between groups. This is for fitting nonlinear mixed-effects models, described in Pinheiro and Bates (Springer, 2000), Mixed-Effects Models in S and S-PLUS. As I loop over the groups I need the binding of data in the closure to change to each SubDataFrame. I am currently doing this by assigning a global variable _sdf before defining f and changing the global variable’s binding in the loop.

Are there better approaches? I feel that I should be able to do something with let but I can’t seem to work out how to do so.

kristoffer.carlsson · July 12, 2019, 7:03pm

Put the variable in some sort of container (like a Ref) and update that instead?

foobar_lv2 · July 12, 2019, 7:05pm

When in doubt, forget that closures exist and do it by hand. E.g.

julia> struct F{T}
       conc::T
       end

julia> function (f::F)(par)
       VM=pars[1]
       K=pars(2)
       conc = f.conc
       @. Vm * conc / (K + conc)
       end

Now, each instance of F, like e.g. F(data.conc), is a valid input to forward diff. This is how closures are internally implemented.

Changing bindings in closures is supported, but almost always a bad idea. Instead, construct a new closure.

edit: Or do what @kristoffer.carlsson said, depending on what you actually want (I don’t know whether I understood what your issue is).

dmbates · July 12, 2019, 8:51pm

In case someone reading this later needs more detail, I created the Ref from the first SubDataFrame, then define the function for ForwardDiff.jacobian! dereferencing the Ref. If data is the GroupedDataFrame it looks like

Rdf = Ref(first(data))
f(pars) = MicMen(pars, Rdf[])

Then the loop looks like

for (res, sdf) in zip(results, data)
    Rdf[] = sdf
    ... manipulate the pars to get the appropriate value
    ForwardDiff.jacobian!(res, f, pars, cfg)
end

where cfg is a (common) JacobianConfig and results is a preallocated vector of the form

results = [DiffResults.JacobianResult(sdf.age, pars) for sdf in data]

Tamas_Papp · July 13, 2019, 6:06am

Is it the same pars for all subdataframes? It is unclear what the role of pars is, can you provide an MWE for the complete calculation?

dmbates · July 13, 2019, 6:22pm

The values of pars change from group to group. The groups of observations correspond to experimental units of some sort in the experiment. In a pharmacokinetics study there will be several participants each of whom receives dosages of a drug following which several measurements of serum concentrations over time are made.

tl; dr As an example, in R, datasets::Theoph is data from such a study. A compartment model for concentration as a function of time depends upon three scalar parameters for each subject: ka, the absorption rate constant, k, the elimination rate constant and V, the effective volume of distribution, as

(dose/V) * (ka /(ka - k)) * (exp(-k*t) - exp(-ka*t))

Typically these parameters, all of which must be positive, are estimated on the logarithmic scale so the model function is

function sdOral1C(φ, d)
    k  = exp(φ[1])    # elimination rate constant from lk
    ka = exp(φ[2])    # absorption rate constant from lka
    V  = exp(φ[3])    # volume of distribution from lV
    t  = d.Time
    @. (d.Dose/V) * (ka/(ka - k)) * (exp(-k*t) - exp(-ka*t))
end

If we pool all the data we get

m = fit!(NLregModel(sdOral1C, Theoph, (lk = -2.5, lka = 0.5, lV = -1.0), :Conc))
Nonlinear regression model (NLregModel)

 Residual sum of squares: 274.4491345682938

──────────────────────────────────
      Estimate  Std.Error  t value
──────────────────────────────────
lk   -2.52424   0.110347    -22.88
lka   0.399228  0.117537      3.40
lV   -0.724023  0.0485799   -14.90
──────────────────────────────────

If we fit each subject’s data separately we get

julia> fits = [fit!(NLregModel(sdOral1C, sdf, params(m), :Conc)) for sdf in groupby(Theoph, :Subject)]
12-element Array{NLregModel{3,Float64},1}:
 Nonlinear regression model (NLregModel)

 Residual sum of squares: 2.44424021741903

──────────────────────────────────
      Estimate  Std.Error  t value
──────────────────────────────────
lk   -2.30733   0.196063    -11.77
lka   0.151628  0.213966      0.71
lV   -0.665909  0.0980873    -6.79
──────────────────────────────────
   
 Nonlinear regression model (NLregModel)

 Residual sum of squares: 0.9965571862947799
...
 Nonlinear regression model (NLregModel)

 Residual sum of squares: 13.463469674674933

──────────────────────────────────
      Estimate  Std.Error  t value
──────────────────────────────────
lk   -2.42548    0.277133    -8.75
lka   0.386278   0.296296     1.30
lV   -0.707117   0.127841    -5.53
──────────────────────────────────

A mixed-effects model for these data is intermediate to the pooled model and the individual model. Each subject’s parameters are a sum of population values (called “fixed effects”) and subject-specific deviations (or “random effects”) which are assumed to be samples from a multivariate distribution (typically multivariate Gaussian). The purpose is to estimate the fixed effects and the variance-covariance parameters in the distribution of the random effects. This distribution regularizes the optimization of the individual’s parameter values. Instead of a nonlinear least squares fit, the optimization of each individual’s parameters becomes a penalized nonlinear least squares fit where the penalty depends on those variance-covariance parameters.

NLregModel is defined in GitHub - dmbates/NLreg.jl: Nonlinear regression in Julia which is also where I am working on an implementation of a nonlinear mixed-effects model.