MWE fitting a model with a known parameter

AlexisRenchon · December 18, 2021, 8:48pm

Hello, I have a (soil respiration) model, with soil temperature and soil moisture as input variables, and a bunch of parameters, see the code below.

I am trying to figure out how to fit the model when one (or more) parameter is known in advance (so we don’t want to fit them).

Below is a MWE, with first what works (sometimes), and then what does not work.

I hope you guys have a solution for me!

function DAMM(x,p)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = p[5] # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[6] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

x_fake = [18.0 0.35; 22.0 0.22; 30.0 0.30; 35.0 0.39] # fake input data
p_test = [1, 6, 3.46, 2, 0.4, 1.25] # Parameters test
R_test = DAMM(x_fake, p_test) # function test. output = [3.0, 5.3, 9.4, 4.3]
R_fake = [1.2, 2.5, 5, 1.5] # fake output data. A bit lower than R_test

using LsqFit # Let's try to fit param to better match Resp
function fitDAMM(x, R) # 
  lb = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0] # Porosity can't be below max(M)
  ub = [Inf, Inf, Inf, Inf, 1.0, Inf] # Porosity can't be above 1
  p_ini = [1, 6, 3.46, 2, 0.5, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out = fitDAMM(x_fake, R_fake) # works
R_out = DAMM(x_fake, p_out) # good fit! very close to R_fake.

# attempt with forced p[5]
poro_val = 0.4 # Porosity parameter (p[5]) is known
function fitDAMM_f(x, R, poro_val) 
  lb = [0.0, 0.0, 0.0, 0.0, poro_val, 0.0] # force p[5] to 0.4
  ub = [Inf, Inf, Inf, Inf, poro_val, Inf] # force p[5] to 0.4
  p_ini = [1, 6, 3.46, 2, poro_val, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out_f = fitDAMM_f(x_fake, R_fake, poro_val) # bug

I say the first “solution” works sometime, because sometime it doesn’t, as the algorithm of curve_fit still attempt parameters below the lower bound lb , which crashes the model (the model equations crash if porosity is lower than moisture).

What I need is either:
A way to force a parameter value
A way for the algorithm not to attempt parameter value below the lower bound

rafael.guerra · December 18, 2021, 9:57pm

May I ask why you just don’t set porosity = 0.4 in the body of your DAMM() function?

AlexisRenchon · December 18, 2021, 10:19pm

hey @rafael.guerra , thanks for your reply!
are you suggesting:

function DAMM(x,p, poro_val)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = poro_val # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[5] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

x_fake = [18.0 0.35; 22.0 0.22; 30.0 0.30; 35.0 0.39] # fake input data
p_test = [1, 6, 3.46, 2, 1.25] # Parameters test
poro_val = 0.4
R_test = DAMM(x_fake, p_test, poro_val) # function test. output = [3.0, 5.3, 9.4, 4.3]
R_fake = [1.2, 2.5, 5, 1.5] # fake output data. A bit lower than R_test

using LsqFit # Let's try to fit param to better match Resp
function fitDAMM(x, R) # 
  lb = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0] # Porosity can't be below max(M)
  ub = [Inf, Inf, Inf, Inf, 1.0, Inf] # Porosity can't be above 1
  p_ini = [1, 6, 3.46, 2, 0.5, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out = fitDAMM(x_fake, R_fake)

?
I’ve thought about it, and tried, but I don’t do it because:

it modify the DAMM function which is not ideal…
it doesn’t work for curve_fit() as you can see the first argument of that function is DAMM and expect it to be a function of the form DAMM(x, p), not DAMM(x, p, other_argument)

but maybe I am missing something? Do you have a MWE that works?

I can force things by modifying the function to

function DAMM(x,p)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = poro_val # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[5] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

and define poro_val before calling the function… it works, but it’s really not ideal - it means the function needs a global variable defined outside of it to work

AlexisRenchon · December 18, 2021, 10:35pm

More importantly, putting poro_val inside the function without it being an argument creates problem when poro_val is an Observable which is why I really need to fix this issue

AlexisRenchon · December 18, 2021, 11:09pm

@rafael.guerra I love all your ideas! I’m testing them.
I didn’t know what multiple dispatch actually meant, and I just read about it and learned that it allows to redefine a function multiple times with different argument types. Sounds great and maybe it can be helpful - I’ll try
(I need to read more about multiple dispatch, does it allow to do more than just change the type of the arguments?)

I don’t know why I didn’t try conditional parameters. Sounds like it could work… not sure yet… I’m trying right now, I’ll post an MWE when I have one, if I find one!
Thanks!

AlexisRenchon · December 18, 2021, 11:12pm

do you know how to write the code to do that?

AlexisRenchon · December 18, 2021, 11:20pm

this worked:

poro_val = 0.4
function DAMM(x, p; porosity = poro_val)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	porosity = poro_val # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[5] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

x_fake = [18.0 0.35; 22.0 0.22; 30.0 0.30; 35.0 0.39] # fake input data
p_test = [1, 6, 3.46, 2, 1.25] # Parameters test
R_test = DAMM(x_fake, p_test) # poro_val not defined. output = [3.0, 5.3, 9.4, 4.3]
R_fake = [1.2, 2.5, 5, 1.5] # fake output data. A bit lower than R_test

using LsqFit # Let's try to fit param to better match Resp
function fitDAMM(x, R) # 
  lb = [0.0, 0.0, 0.0, 0.0, 0.4, 0.0] # Porosity can't be below max(M)
  ub = [Inf, Inf, Inf, Inf, 1.0, Inf] # Porosity can't be above 1
  p_ini = [1, 6, 3.46, 2, 0.5, 1.25] # same as p
  fit = curve_fit(DAMM, x, R, p_ini, lower=lb, upper=ub)
  params = coef(fit)
  return params
end

p_out = fitDAMM(x_fake, R_fake) # works!
R_out = DAMM(x_fake, p_out)

however, what you suggested, I think, would be DAMM(x, p; p[5], poro_val) (?) it looks better, but I am not sure how to write it

EDIT: I see you just posted, I’ll try your code

AlexisRenchon · December 18, 2021, 11:26pm

it does make sense! despite my 2nd beer

AlexisRenchon · December 18, 2021, 11:35pm

so the function became:

poro_val = 0.4
function DAMM(x,p; parm_num=0, value=poro_val)
     # Constants
	R = 8.314472e-3 # Universal gas constant, kJ K⁻¹ mol⁻¹
	O₂ₐ = 0.209 # Volume of O₂ in the air, L L⁻¹
	Dₗᵢ = 3.17 # Diffusion coefficient of substrate in liquid phase, dimensionless
	Dₒₐ = 1.67 # Diffusion coefficient of oxygen in air, dimensionless
	pₛₓ = 0.024 # Fraction of soil carbon that is considered soluble
     # Independent variables
	Tₛ = x[:, 1] # Soil temperature, °C
	θ = x[:, 2] # Soil moisture, m³ m⁻³
     # Parameters 
	αₛₓ = 1e8*p[1] # Pre-exponential factor, mgC cm⁻³ h⁻¹
     	Eaₛₓ = 1e1*p[2] # Activation energy, kJ mol⁻¹
	kMₛₓ = 1e-8*p[3] # Michaelis constant, gC cm⁻³
    	kMₒ₂ = 1e-3*p[4] # Michaelis constant for O₂, L L⁻¹
	if parm_num == 0
	   porosity = p[5]    # as in your original code
	      # can be other elsif for other params
	elseif parm_num == 5
	      # modified original function with
	   porosity = value
	end
	#porosity = p[5] # 1 - soil buld density / soil particle density
	Sxₜₒₜ = 1e-2*p[6] # Total soil carbon, gC cm⁻³
     # DAMM model
	Vmax = @. (αₛₓ * exp(-Eaₛₓ/(R * (273.15 + Tₛ)))) # Maximum potential rate of respiration
     	Sₓ = @. pₛₓ * Sxₜₒₜ * Dₗᵢ * θ^3 # All soluble substrate, gC cm⁻³
	MMₛₓ = @. Sₓ / (kMₛₓ + Sₓ) # Availability of substrate factor, 0-1 
	O₂ = @. Dₒₐ * O₂ₐ * ((porosity - θ)^(4/3)) # Oxygen concentration
	MMₒ₂ = @. O₂ / (kMₒ₂ + O₂) # Oxygen limitation factor, 0-1
	Resp = @. Vmax * MMₛₓ * MMₒ₂ # Respiration, mg C cm⁻³ hr⁻¹
	Respc = Resp .* 2314.8148 # Respiration, μmol CO₂ m⁻² s⁻¹
end

I’ll test the whole thing (the actual project using this is much bigger https://github.com/CUPofTEAproject/www.cupoftea.earth) tomorrow or on Monday, or my Fiance will kill me…
Thanks a lot for your help @rafael.guerra I’ll keep you posted!

juliohm · December 18, 2021, 11:43pm

I didn’t read the code carefully, but aren’t you after a constrained optimization problem? There are various Julia packages that allow optimization where some of the parameters are constrained between lower and upper limits for example. I think LsqFit.jl has an option to pass a vector lower = [-Inf, -Inf, ..., val, ..., -Inf] and upper = [Inf, Inf, ..., val, ..., Inf]?

Alternatively, you can create a wrapper function with fixed parameters?

foo(a, b, c) = a+b+c
bar(a, b) = foo(a, b, 3.0)

AlexisRenchon · December 19, 2021, 12:58am

I don’t have a super clean code yet, but I have something that works now! thank you for your help @rafael.guerra and @juliohm

rafael.guerra · December 19, 2021, 9:37am

Sorry Alexis, I was not able to implement the idea above in LsqFit so I deleted my posts to keep the thread clean.

After searching LsqFit.jl github, it seems that you already got a solution here by @Gabriel_Kreindler, which is basically the same as @juliohm’s advice.

rafael.guerra · December 19, 2021, 11:37am

As a simple MWE, the following synthax seems to work fine to fix the 3rd parameter of a LsqFit.jl model with 4 parameters:

using LsqFit
@. model(t, p) = p[1] + p[2]*sin(2π*p[3]*t)*exp(-t*p[4])

tdata = LinRange(0,4,1000)
ydata = model(tdata, [1.0 2.0 30.0 0.5]) + 0.01*randn(length(tdata))
lb = [0, 0.0, 0.0, 0.0]
ub = [Inf, Inf, Inf,Inf]
p0 = [1.2, 1.2, 27.0, 0.1]
fit_bounds = curve_fit(model, tdata, ydata, p0, lower=lb, upper=ub)
fit_bounds.param    # 1.0003, 2.0001, 30.000, 0.4996

# Fix 3rd parameter (which disappears from constraints):
lb = [0, 0.0, 0.0]
ub = [Inf, Inf, Inf]
p0 = [1.2, 1.2, 0.1]
modelfixparam3(t,p) = model(t, (p[1], p[2], 30.0, p[3]))
fit_bounds2 = curve_fit(modelfixparam3, tdata, ydata, p0, lower=lb, upper=ub)
fit_bounds2.param  # 1.0000, 2.0013, 0.5004

AlexisRenchon · December 19, 2021, 4:55pm

yes, I couldn’t write a script (MWE) with @Gabriel_Kreindler solution for some reason (my lack of coding skills), I closed the issue on LsqFit.jl repo

rafael.guerra · December 19, 2021, 5:02pm

No worries, that makes two of us.

Topic		Replies	Views
Differential Equations and Parameter Estimation General Usage	21	3529	July 2, 2019
Fitting parameters together with initial conditions of an ODE to real data Modelling & Simulations question , turing , differentialequation	1	342	January 20, 2023
ADAM solver returns parameter values resulting in poor fit for my SDE model New to Julia sde , optimization , interpolations , differentialequation	4	142	December 3, 2024
ModelingToolkit, SciMLSensitivity: Computed parameter? Modelling & Simulations question , optimization , modelingtoolkit	1	61	December 12, 2024
MTK & missing parameters, MWE Modelling & Simulations	3	372	April 12, 2023

MWE fitting a model with a known parameter

Related topics