Non-linear analysis

SergeantMike67 · October 28, 2024, 8:15pm

Having a bit of an issue and I don’t quite know where to turn.

I have insect emergence data that fits a Gompertz model quite well.

Example Data
Row │ ED Wasp
│ Float64 Float64
─────┼──────────────────
1 │ 14.375 2.0
2 │ 14.875 1.0
3 │ 15.375 10.0
4 │ 15.5417 11.0
5 │ 15.7083 1.0
6 │ 15.875 9.0
7 │ 16.5417 351.0
8 │ 16.7083 4.0
9 │ 16.875 153.0
10 │ 17.375 36.0
11 │ 17.5417 110.0
12 │ 17.7083 4.0
13 │ 18.375 55.0
14 │ 18.5417 105.0
15 │ 18.7083 1.0
16 │ 19.375 8.0
17 │ 19.5417 41.0
18 │ 19.7083 3.0
19 │ 20.5417 8.0
20 │ 21.5417 5.0
21 │ 21.7083 1.0
22 │ 21.875 1.0
23 │ 22.375 3.0
24 │ 22.5417 5.0
25 │ 24.5417 2.0

I am currently using RCall to determine the coefficients of the Gompertz Model based on the data. The equation R uses is below

y~a*exp(-b*exp(-c*x)
where
a is the asymptote
b is displacement along the x axis
c is the growth rate

all is well and good to this point.

the kink comes from a paper i found that uses the SpSS version of the equation

`y~aexp(-exp(-c(x-m)))’
where
a is the asymptote
b is an implied 1
c is slope of the growing part of the curve
m is the inflection point or where the growth goes exponetial.

What I would like to do is use the SPSS version without having to purchase and learn how to use SPSS to determine the fit of a line with the data.

I have no idea if any of this makes sense but I am happy to clarify and answer any questions.

Thank you so much
Mike

mthelm85 · October 28, 2024, 9:52pm

There are various libraries for fitting non-linear functions. Maybe give LsqFit.jl a try?

PeterSimon · October 28, 2024, 10:50pm

The expressions used by R and SPSS are equivalent. You can easily derive the SPSS coefficients from the R coefficients.

Suppose a, b, and c are the coefficients found from your R call, and that a', c', and m' are the coefficients that would be returned by SPSS. It is simple to show that a' = a, c' = c, and m' = \frac{\text{ln} \, b}{c}.

BLI · October 28, 2024, 10:52pm

How should your data be understood? Like:

D = [14.375 2.0
14.875 1.0
15.375 10.0
15.5417 11.0
15.7083 1.0
15.875 9.0
16.5417 351.0
16.7083 4.0
16.875 153.0
17.375 36.0
17.5417 110.0
17.7083 4.0
18.375 55.0
18.5417 105.0
18.7083 1.0
19.375 8.0
19.5417 41.0
19.7083 3.0
20.5417 8.0
21.5417 5.0
21.7083 1.0
21.875 1.0
22.375 3.0
22.5417 5.0
24.5417 2.0]

??

If I plot these data,

using Plots
scatter(D[:,1], D[:,2])

or

Does this make sense?

SergeantMike67 · October 29, 2024, 2:40pm

no because I am looking at the cumlative information

so I sent raw data and forgot to explain that part. Too many things going on at the same time.

15.7083 1.0
15.875 10.0
16.5417 361.0
16.0783 365.0
etc

SergeantMike67 · October 29, 2024, 2:44pm

Good gawd Peter, you have provided the answer that myself (not exactly a high bar here), a tenured professor of Mathematics, a Ph.D. student in Mathematics and a LOT of reading on the internet couldn’t provide. The conversions between the two equations. Thank you!

Edit: I noticed that I really kinda botched the the SPSS version. It should read
’ y~a+c*(exp(-exp(-b*(t-m)))’

where
a+c = asymptote
b= slope of the growing part of the curve
m= inflexion point

BLI · October 29, 2024, 4:26pm

A “quick and dirty” use of LsqFit.jl gave this:

Of course, I don’t know what your data shows (wasp arrival?? after number of days??), so you would probably use different text for the plot + labels + abscissa.

Some code:

# Importing packages
using LsqFit, Plots

# Defining some parameters for plotting
LW1 = 2.5
LC1 = :darkorange
LC2 = :red
LC3 = :blue
LS1 = :solid
LS2 = :dash

# Data
data = [14.375 2.0
14.875 1.0
15.375 10.0
15.5417 11.0
15.7083 1.0
15.875 9.0
16.5417 351.0
16.7083 4.0
16.875 153.0
17.375 36.0
17.5417 110.0
17.7083 4.0
18.375 55.0
18.5417 105.0
18.7083 1.0
19.375 8.0
19.5417 41.0
19.7083 3.0
20.5417 8.0
21.5417 5.0
21.7083 1.0
21.875 1.0
22.375 3.0
22.5417 5.0
24.5417 2.0]

x = data[:,1]
y = cumsum(data[:,2])
fg = plot(x,y; seriestype=:steppre, lw=LW1, lc=LC1, label="cumulative wasp arrival")

# Gompertz model
function model_1(x,p)
    a,b,c = p
    return a*exp.(-b*exp.(-c*x))
end

p0_1 = [900.0,16,1]
fit_1 = curve_fit(model_1, x, y, p0_1)
param_1 = fit_1.param

cov_1 = estimate_covar(fit_1)

conf_int_1 = confidence_interval(fit_1,0.1)

y_hat_1(x) = model_1(x,param_1)

x_range = range(minimum(x),maximum(x),length=100)

plot!(fg, x_range,y_hat_1(x_range); lw=LW1, lc=LC2, label="Gompertz model")

# SpSS model
function model_2(x,p)
    a,b,c,m = p
    return a .+ c*exp.(-exp.(-b*(x .- m)))
end

p0_2 = [600.0, 1, 300, 16]

fit_2 = curve_fit(model_2, x, y, p0_2)
param_2 = fit_2.param

cov_2 = estimate_covar(fit_2)

conf_int_2 = confidence_interval(fit_2,0.1)

y_hat_2(x) = model_2(x,param_2)

plot!(fg,x_range,y_hat_2(x_range); lw=LW1, lc=LC3, ls=LS2,label="SpSS model")
plot!(xlabel="days", title="Cumulative wasp arrival", frame=:box, legend=:bottomright)

SergeantMike67 · October 29, 2024, 5:20pm

Interesting. So using
fit.param

would give me the best estimated parameters (a,b,c,m) for the curve that it calculated?

PeterSimon · October 29, 2024, 5:22pm

Of course this invalidates the relation I found earlier.

BLI · October 29, 2024, 5:27pm

Yes.

SergeantMike67 · October 29, 2024, 6:36pm

figured it did, Sorry

SergeantMike67 · October 29, 2024, 6:37pm

in that order?

BLI · October 29, 2024, 6:58pm

Not quite sure what you mean, but I assume that you mean fit.param with fit first and param last? OK – let me be a little bit more precise.

Function curve_fit with arguments (i) model function (e.g., model_1), (ii) “x-data” (e.g., x), (iii) “y-data” (e.g., y), and (iv) initial parameter guess (e.g., p0_1) in that order, returns a data structure.
By writing fit_1 = curve_fit(model_1, x, y, p0_1), I give the returned data structure the name fit_1.
The returned data structure has a so-called field with name param. This field contains the fitted parameter. The syntax for getting the content of field param in the data structure (which I have named fit_1), is fit_1.param.

So – no matter what I choose to call the fitted data structure, the field name is always param – the field name was chosen by the developers of LsqFit and we have to follow their choice. So if I instead had called the returned data structure Gompertz_estimate by using statement

Gompertz_estimate = curve_fit(model_1,x,y,p0_1)

the field name containing the estimated parameter is still the same, and is extracted (this time) by statement:

Gompertz_estimate.param

SergeantMike67 · October 29, 2024, 7:19pm

right I understand this but when I look at the values in fit_2 I get:
`fit_1

-15.399481864169104
1.1706823975674077
933.2300788361196
16.52300592020053

I don’t which value is for which parameter. Is a=-15.399? or is that m?

BLI · October 29, 2024, 7:40pm

Ah. The order is given by the order in models, e.g.:

# Gompertz model
function model_1(x,p)
    a,b,c = p
    return a*exp.(-b*exp.(-c*x))
end

and

# SpSS model
function model_2(x,p)
    a,b,c,m = p
    return a .+ c*exp.(-exp.(-b*(x .- m)))
end

So for model_2, a,b,c,m = p implies that

a = -15.399…
b = 1.170…
c = 933.23…
m = 16.523…

SergeantMike67 · October 29, 2024, 7:55pm

that’s what I thought but when I use those parameters back into the equation I don’t get even close to the right answer for any given x in the data I posted.

BTW this is wasps emerging from a gall data.

BLI · October 29, 2024, 8:02pm

What does your function look like? The one you put the parameters back into?

SergeantMike67 · October 29, 2024, 8:22pm

I used the code you posted and am using model 2.

BLI · October 29, 2024, 9:34pm

OK… again, here are the curves:

The (dark) orange curve is the cumulative of the output data you provided. Clearly, the fit is not good around abscissa value 16 (as an example). Also, at ca. 17, there is an offset. For larger abscissa values than 20, the fit is decent, I’d say.

It is not possible to work magic: the fitted curves (= the models) will never be equal to reality (the orange curve). But they may give an indication.

It may be possible to get better fit for lower abscissa values by using a different loss function than what LsqFit uses (by default – don’t know if it is possible to change that). Or by using different models.

I’m not a domain expert, so I don’t know what models are used in your field.

If you want to try with other models, you can do so. Below, I’ve added the logistic function:

Topic		Replies	Views
Curve fitting with NonlinearSolve.jl General Usage curve-fitting	9	483	December 24, 2024
Fitting dose response curves General Usage	35	4654	August 7, 2018
Plot fitted curves New to Julia question , curve-fitting	6	1495	March 21, 2022
Alternatives to LsqFit for nonlinear curve fitting? General Usage curve-fitting	12	2898	December 21, 2023
Lsqfit Nonlinear Regression Using Two Parameters: One Parameter Is Stuck At The Initial Parameter Value General Usage question , regression , curve-fitting , nonlinear , lsqfit	0	211	December 24, 2023

Non-linear analysis

Related topics