Adding logarithmic or polynomial regression to series

Good Day:

I have been attempting to fit a logarithmic or
polynomial trendline to the following series:

I am currently using GLM.jl and plotting with StatsPlots.jl, but I do
not believe there are keyword arguments I can pass to plot!(kwarg…) to
add such a trendline to the existing series.

Will I need to set the x and y scales to :log10? Do I need to
define p0? Any suggestions?

Thank you,

Please post code :grinning:

I don’t understand your question - are you asking how to fit a polynomial regression in GLM, or have you already done that and are struggling to plot a linear and a polynomial model on the same graph?

@nilshg Thanks for your questions, Nils.

I am struggling to overlay a polynomial | logarithmic
regression over the series. The linear model here
does not quite fit the data and was simply the built
in param within the scatter( … smooth=:true) method.

@ptoche – The code I am using to generate the graph

using StatsPlots, PlotlyJS

stp.scatter(DF[!, :Sales], DF[!, :Δ], 
	        label="Intersection", xlab="Sales",
	        xlim=(0.6, 1.01),
	        lw = 2, smooth = true)

Not sure what the problem is, but just in case it may help:

using GLM, DataFrames, StatsPlots; gr(dpi=600)

# automatically digitized with Chrome's WebPlotDigitizer app:
m = [ 0.62  2.25e-9;
      0.64  2.43e-9;
      0.66  2.65e-9;
      0.74  2.87e-9;
      0.77  2.08e-9;
      0.78  3.08e-9;
      0.81  3.25e-9;
      0.88  3.38e-9;
      0.96  3.45e-9;
      0.99  3.49e-9 ]

DF = DataFrame(Sales = m[:,1], Δ = m[:,2])

Plots.scatter(DF[!, :Sales], DF[!, :Δ], 
	label="Input data", xlab="Sales",
	xlim=(0.6, 1.01),
	lw = 2, lc = :red)

pfit = lm(@formula(Δ ~ 1 + Sales), DF)
a, b = round.(coef(pfit), sigdigits=3)

plot!(DF.Sales, predict(pfit), lc=:cyan, lw=3, ls=:dot, label="Cases Δ = $a + $b * Sales")
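
A logarithmic or quadratic trend is still linear in its coefficients, so GLM can fit it by transforming the predictor inside the formula. The following is only a sketch under assumptions: it reuses the digitized values above, and the names logfit, quadfit and grid are made up for illustration.

```julia
using GLM, DataFrames

# Digitized values from the post above (Sales, Δ)
DF = DataFrame(
    Sales = [0.62, 0.64, 0.66, 0.74, 0.77, 0.78, 0.81, 0.88, 0.96, 0.99],
    Δ     = [2.25, 2.43, 2.65, 2.87, 2.08, 3.08, 3.25, 3.38, 3.45, 3.49] .* 1e-9,
)

# Logarithmic trend: Δ = a + b * log(Sales)
logfit = lm(@formula(Δ ~ 1 + log(Sales)), DF)

# Quadratic trend: Δ = a + b * Sales + c * Sales^2
quadfit = lm(@formula(Δ ~ 1 + Sales + Sales^2), DF)

# Dense grid (hypothetical helper) so the fitted curves plot smoothly
grid  = DataFrame(Sales = collect(range(0.6, 1.0, length = 100)))
Δlog  = predict(logfit, grid)
Δquad = predict(quadfit, grid)
```

With StatsPlots loaded, the curves could then be overlaid on the existing scatter with plot!(grid.Sales, Δlog, label="log fit") and plot!(grid.Sales, Δquad, label="quadratic fit"). Because the transformation lives in the formula, no :log10 axis scaling is required, and because both models are ordinary least squares, no initial guess p0 is needed.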


@rafael.guerra Thank you for this Rafael – it
inadvertently answered a question that has
been lingering on my mind.

I think the reason why the curve/regression is
straight is because of the scale. I wanted to
produce a logistic or polynomial curve to better
fit the series.

@YummyPampers2, you have a significant outlier in your data set, which raises the question of the accuracy of the other points. If their error bars are large and there is no a priori model for the data, could the straight line be an honest fit? If you have the data uncertainties, a weighted regression could be performed.
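
To illustrate what a weighted regression could look like, here is a minimal sketch using only the standard library. The uncertainties σ are entirely hypothetical (the thread gives none) and the helper name wls is made up; the real measurement errors would have to replace them.

```julia
using LinearAlgebra

# Digitized values from the thread
x = [0.62, 0.64, 0.66, 0.74, 0.77, 0.78, 0.81, 0.88, 0.96, 0.99]
y = [2.25, 2.43, 2.65, 2.87, 2.08, 3.08, 3.25, 3.38, 3.45, 3.49] .* 1e-9

# HYPOTHETICAL 1-sigma uncertainties; replace with the real error bars
σ = fill(0.1e-9, length(y))
σ[5] = 1.0e-9   # assumption: the suspicious point is far less certain

# Weighted least squares for y ≈ a + b*x with inverse-variance weights
function wls(x, y, σ)
    X = [ones(length(x)) x]        # design matrix: intercept and slope
    W = Diagonal(1 ./ σ .^ 2)      # weight each point by 1/σ²
    return (X' * W * X) \ (X' * W * y)
end

a, b = wls(x, y, σ)
```

With equal uncertainties this reduces to the ordinary least-squares line; with a large σ on one point, that point simply pulls less on the fit instead of being deleted.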


Wow! Nice!

@YummyPampers2, your dataset is very small: I would not recommend non-linear regression for such a small sample unless you have very good reasons. You may want to check the docs for something along the lines of logit = glm(@formula(Δ ~ 1 + Sales), DF, Binomial(), LogitLink()), but do read up on this carefully beforehand to make sure you want to do that, e.g. Stock and Watson, around page 328.


You may want to play with the excellent Polynomials.jl package.

By removing the outlier you may get a decent fit, but polynomials can in general be dangerous for extrapolation, especially the high-order ones:

using Polynomials
(; Sales, Δ) = DF;      # Julia 1.7 destructuring
deleteat!(Sales, 5)     # this and next line will delete DF row!
deleteat!(Δ, 5)
p = Polynomials.fit(Sales, Δ, 2)   # qualified, since GLM also exports fit
scatter(Sales, Δ, label="Input without outlier", legend=:topleft)
plot!(Sales, p.(Sales), lc=:green, lw=2, ls=:dash, label="2nd order Polynomial")


@rafael.guerra – thank you for this. Would you recommend
using OutlierDetection.jl or something similar (e.g.,
LinRegOutliers.jl) to remove the outlier?

When I apply your instructions, I am generating

I think the original outlier is at index 10.

Testing now, any suggestions short of
calculating Cook’s Distance and performing
some quartile exclusion?
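
One lightweight option, sketched here under assumptions rather than as a recommendation, is to flag points whose residual from a straight-line fit lies far from the median residual, using the median absolute deviation (MAD) as a robust scale. The cutoff of 3 and the variable names are arbitrary choices for illustration, and the data are the digitized values posted earlier, so the real series may behave differently.

```julia
using LinearAlgebra, Statistics

# Digitized values from the thread
x = [0.62, 0.64, 0.66, 0.74, 0.77, 0.78, 0.81, 0.88, 0.96, 0.99]
y = [2.25, 2.43, 2.65, 2.87, 2.08, 3.08, 3.25, 3.38, 3.45, 3.49] .* 1e-9

# Ordinary least-squares line and its residuals
X = [ones(length(x)) x]
β = X \ y
r = y .- X * β

# Robust scale: MAD, scaled by 1.4826 to match σ for normal data
med = median(r)
mad = 1.4826 * median(abs.(r .- med))

# Flag points more than 3 robust standard deviations from the median residual
# (the factor 3 is an assumed, conventional cutoff)
flagged = findall(abs.(r .- med) .> 3 * mad)
```

On the digitized values this flags only row 5 (Sales = 0.77), the same row removed by deleteat! earlier in the thread.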


Take a look at the RAFF.jl package, which looks interesting for this problem, as it allows fitting a nonlinear model robustly in the presence of outliers.


@rafael.guerra Excellent!

Thank you for sharing the
knowledge – will test out
some use cases and

Best regards,