MWE fitting a model with a known parameter

Hello, I have a (soil respiration) model, with soil temperature and soil moisture as input variables, and a bunch of parameters, see the code below.

I am trying to figure out how to fit the model when one (or more) parameter is known in advance (so we don’t want to fit them).

Below is a MWE, with first what works (sometimes), and then what does not work.

I hope you guys have a solution for me!

function DAMM(x,p)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = p[5] # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[6] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

x_fake = [18.0 0.35; 22.0 0.22; 30.0 0.30; 35.0 0.39] # fake input data
p_test = [1, 6, 3.46, 2, 0.4, 1.25] # Parameters test
R_test = DAMM(x_fake, p_test) # function test. output = [3.0, 5.3, 9.4, 4.3]
R_fake = [1.2, 2.5, 5, 1.5] # fake output data. A bit lower than R_test

using LsqFit # Let's try to fit param to better match Resp
function fitDAMM(x, R) # 
  lb = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0] # Porosity can't be below max(M)
  ub = [Inf, Inf, Inf, Inf, 1.0, Inf] # Porosity can't be above 1
  p_ini = [1, 6, 3.46, 2, 0.5, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out = fitDAMM(x_fake, R_fake) # works
R_out = DAMM(x_fake, p_out) # good fit! very close to R_fake.

# attempt with forced p[5]
poro_val = 0.4 # Porosity parameter (p[5]) is known
function fitDAMM_f(x, R, poro_val) 
  lb = [0.0, 0.0, 0.0, 0.0, poro_val, 0.0] # force p[5] to 0.4
  ub = [Inf, Inf, Inf, Inf, poro_val, Inf] # force p[5] to 0.4
  p_ini = [1, 6, 3.46, 2, poro_val, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out_f = fitDAMM_f(x_fake, R_fake, poro_val) # bug

I say the first “solution” works sometime, because sometime it doesn’t, as the algorithm of curve_fit still attempt parameters below the lower bound lb , which crashes the model (the model equations crash if porosity is lower than moisture).

What I need is either:
A way to force a parameter value
A way for the algorithm not to attempt parameter value below the lower bound

May I ask why you just don’t set porosity = 0.4 in the body of your DAMM() function?

hey @rafael.guerra , thanks for your reply!
are you suggesting:

function DAMM(x,p, poro_val)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = poro_val # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[5] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

x_fake = [18.0 0.35; 22.0 0.22; 30.0 0.30; 35.0 0.39] # fake input data
p_test = [1, 6, 3.46, 2, 1.25] # Parameters test
poro_val = 0.4
R_test = DAMM(x_fake, p_test, poro_val) # function test. output = [3.0, 5.3, 9.4, 4.3]
R_fake = [1.2, 2.5, 5, 1.5] # fake output data. A bit lower than R_test

using LsqFit # Let's try to fit param to better match Resp
function fitDAMM(x, R) # 
  lb = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0] # Porosity can't be below max(M)
  ub = [Inf, Inf, Inf, Inf, 1.0, Inf] # Porosity can't be above 1
  p_ini = [1, 6, 3.46, 2, 0.5, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out = fitDAMM(x_fake, R_fake) 

?
I’ve thought about it, and tried, but I don’t do it because:

  1. it modify the DAMM function which is not ideal…
  2. it doesn’t work for curve_fit() as you can see the first argument of that function is DAMM and expect it to be a function of the form DAMM(x, p), not DAMM(x, p, other_argument)

but maybe I am missing something? Do you have a MWE that works?

I can force things by modifying the function to

function DAMM(x,p)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = poro_val # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[5] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

and define poro_val before calling the function… it works, but it’s really not ideal - it means the function needs a global variable defined outside of it to work

More importantly, putting poro_val inside the function without it being an argument creates problem when poro_val is an Observable which is why I really need to fix this issue

@rafael.guerra I love all your ideas! I’m testing them.
I didn’t know what multiple dispatch actually meant, and I just read about it and learned that it allows to redefine a function multiple times with different argument types. Sounds great and maybe it can be helpful - I’ll try
(I need to read more about multiple dispatch, does it allow to do more than just change the type of the arguments?)

I don’t know why I didn’t try conditional parameters. Sounds like it could work… not sure yet… I’m trying right now, I’ll post an MWE when I have one, if I find one!
Thanks!

do you know how to write the code to do that?

this worked:

poro_val = 0.4
function DAMM(x, p; porosity = poro_val)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = poro_val # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[5] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

x_fake = [18.0 0.35; 22.0 0.22; 30.0 0.30; 35.0 0.39] # fake input data
p_test = [1, 6, 3.46, 2, 1.25] # Parameters test
R_test = DAMM(x_fake, p_test) # poro_val not defined. output = [3.0, 5.3, 9.4, 4.3]
R_fake = [1.2, 2.5, 5, 1.5] # fake output data. A bit lower than R_test

using LsqFit # Let's try to fit param to better match Resp
function fitDAMM(x, R) # 
  lb = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0] # Porosity can't be below max(M)
  ub = [Inf, Inf, Inf, Inf, 1.0, Inf] # Porosity can't be above 1
  p_ini = [1, 6, 3.46, 2, 0.5, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out = fitDAMM(x_fake, R_fake) # works!
R_out = DAMM(x_fake, p_out) 

however, what you suggested, I think, would be DAMM(x, p; p[5], poro_val) (?) it looks better, but I am not sure how to write it

EDIT: I see you just posted, I’ll try your code

it does make sense! despite my 2nd beer :wink:

1 Like

so the function became:

poro_val = 0.4
function DAMM(x,p; parm_num=0, value=poro_val)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	if parm_num == 0
	   porosity = p[5]    # as in your original code
	      # can be other elsif for other params
	elseif parm_num == 5
	      # modified original function with
	   porosity = value
	end
	#porosity = p[5] # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[6] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

I’ll test the whole thing (the actual project using this is much bigger https://github.com/CUPofTEAproject/www.cupoftea.earth) tomorrow or on Monday, or my Fiance will kill me…
Thanks a lot for your help @rafael.guerra I’ll keep you posted!

I didn’t read the code carefully, but aren’t you after a constrained optimization problem? There are various Julia packages that allow optimization where some of the parameters are constrained between lower and upper limits for example. I think LsqFit.jl has an option to pass a vector lower = [-Inf, -Inf, ..., val, ..., -Inf] and upper = [Inf, Inf, ..., val, ..., Inf]?

Alternatively, you can create a wrapper function with fixed parameters?

foo(a, b, c) = a+b+c
bar(a, b) = foo(a, b, 3.0)
2 Likes

I don’t have a super clean code yet, but I have something that works now! thank you for your help @rafael.guerra and @juliohm

Sorry Alexis, I was not able to implement the idea above in LsqFit so I deleted my posts to keep the thread clean.

After searching LsqFit.jl github, it seems that you already got a solution here by @Gabriel_Kreindler, which is basically the same as @juliohm’s advice.

As a simple MWE, the following synthax seems to work fine to fix the 3rd parameter of a LsqFit.jl model with 4 parameters:

using LsqFit
@. model(t, p) = p[1] + p[2]*sin(2π*p[3]*t)*exp(-t*p[4])

tdata = LinRange(0,4,1000)
ydata = model(tdata, [1.0 2.0 30.0 0.5]) + 0.01*randn(length(tdata))
lb = [0, 0.0, 0.0, 0.0]
ub = [Inf, Inf, Inf,Inf]
p0 = [1.2, 1.2, 27.0, 0.1]
fit_bounds = curve_fit(model, tdata, ydata, p0, lower=lb, upper=ub)
fit_bounds.param    # 1.0003, 2.0001, 30.000, 0.4996

# Fix 3rd parameter (which disappears from constraints):
lb = [0, 0.0, 0.0]
ub = [Inf, Inf, Inf]
p0 = [1.2, 1.2, 0.1]
modelfixparam3(t,p) = model(t, (p[1], p[2], 30.0, p[3]))
fit_bounds2 = curve_fit(modelfixparam3, tdata, ydata, p0, lower=lb, upper=ub)
fit_bounds2.param  # 1.0000, 2.0013, 0.5004
2 Likes

yes, I couldn’t write a script (MWE) with @Gabriel_Kreindler solution for some reason (my lack of coding skills), I closed the issue on LsqFit.jl repo

No worries, that makes two of us.