Fitting generates an extreme standard deviation (DataFrames + EasyFit)

Hi, I am trying to fit some data with EasyFit.jl, but I get an unreasonably large standard deviation on the fitted coefficients. What could be the cause, and what can I do about it?

-------------------- Linear Fit -------------------

Equation: y = ax + b

With: a = 0.377317985376421 ± 0.3462958891395846
      b = 0.19760457694233774 ± 0.15049249899614964

( . . . )

---------------------------------------------------

It may be the way the data is quantized, but I don’t know enough about fits to be sure.

Here’s the code that does the fit:

using CSV
using DataFrames
using Chain
using EasyFit

df = DataFrame(CSV.File("data.csv"))
@chain df begin
  transform!(AsTable(:) => ByRow(row -> (row.d_mm/10)) => "d_cm")
  transform!(AsTable(:) => ByRow(row -> (row.d_cm/2)) => "r_cm")
  select!([:m_g, :d_cm, :r_cm])
end

fit = fitlinear(map(x->log10(x), df.m_g), map(x->log10(x), df.r_cm))

And the data (I would have uploaded it as a .csv, but it seems that either isn’t allowed or I don’t have a high enough trust level, so you’ll have to copy it to a file named data.csv):

m_g,d_mm
4.9735,61.5
4.9735,53.5
4.9735,47.5
4.9735,51.5
4.9735,57.5
4.9735,54.5
2.5225,43.5
2.5225,46.5
2.5225,51.5
2.5225,44.5
2.5225,47.5
2.5225,42.5
1.2575,33.5
1.2575,40.5
1.2575,42.5
1.2575,35.5
1.2575,39.5
1.2575,36.5
0.6285,25.5
0.6285,27.5
0.6285,26.5
0.6285,24.5
0.6285,25.5
0.6285,26.5
0.3185,20.5
0.3185,18.5
0.3185,21.5
0.3185,19.5
0.3185,18.5
0.3185,19.5

Can you update EasyFit? There was a bug in the computation of the standard errors of the coefficients that was fixed a few days ago.

PS: with the latest version I get:

julia> fit = fitlinear(map(x->log10(x), df.m_g), map(x->log10(x), df.r_cm))
------------------- Linear Fit -------------

Equation: y = ax + b

With: a = 0.3773179853762126 ± 0.016897509173654884
      b = 0.19760457694234904 ± 0.007343282037427473

Correlation coefficient, R² = 0.9468307246156893
Average square residue = 0.0014301055663081255

Predicted Y: ypred = [0.46046772540810454, 0.46046772540810454, ...]
residues = [-0.027377394703331004, 0.03314393905085733, ...]

--------------------------------------------
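For anyone who wants to sanity-check these uncertainties independently of the package, here is a minimal sketch using the textbook ordinary-least-squares formulas for the standard errors of the slope and intercept. It assumes the errors reported by the fixed version follow those formulas with n - 2 degrees of freedom (the numbers above are consistent with that, but I have not checked the package source), and it hard-codes the data from data.csv instead of reading the file. The variable names are only illustrative.

```julia
using Statistics  # standard library, provides mean

# Data copied from data.csv above: five masses (g), six diameters (mm) each.
m_g  = repeat([4.9735, 2.5225, 1.2575, 0.6285, 0.3185], inner = 6)
d_mm = [61.5, 53.5, 47.5, 51.5, 57.5, 54.5,
        43.5, 46.5, 51.5, 44.5, 47.5, 42.5,
        33.5, 40.5, 42.5, 35.5, 39.5, 36.5,
        25.5, 27.5, 26.5, 24.5, 25.5, 26.5,
        20.5, 18.5, 21.5, 19.5, 18.5, 19.5]

x = log10.(m_g)
y = log10.(d_mm ./ 20)   # d_mm / 10 gives cm, / 2 gives the radius in cm

n  = length(x)
xm = mean(x)
ym = mean(y)
Sxx = sum((x .- xm) .^ 2)

# Least-squares slope and intercept of y = a*x + b
a = sum((x .- xm) .* (y .- ym)) / Sxx
b = ym - a * xm

# Residual variance with n - 2 degrees of freedom (two fitted parameters)
res = y .- (a .* x .+ b)
s2  = sum(res .^ 2) / (n - 2)

# Standard errors of slope and intercept
se_a = sqrt(s2 / Sxx)
se_b = sqrt(s2 * (1 / n + xm^2 / Sxx))

println("a = ", a, " ± ", se_a)
println("b = ", b, " ± ", se_b)
```

If the assumption holds, this should print values close to the ones from the updated package, and far from the ± 0.35 and ± 0.15 in the original post.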

Just finished recompiling and running, and in fact it works much better! Thank you!
