Adding a trendline or line of best fit

I’m quite surprised there are no posts about trendlines or lines of best fit on these forums. Maybe they are called something else? The only thing I’ve been able to find online is this:

using LinearAlgebra, Plots
dates = 1:150
ticks = 1:12:150
ticks_labels = 0:12
my_values = rand(150).+dates*0.01
plot(dates, my_values, xticks = (ticks, ticks_labels), label="my series")
bhat = [dates ones(150)]\my_values
Plots.abline!(bhat..., label = "trendline")

I did a lot of plotting in Excel and am wondering what the Julia equivalent of a trendline/line of best fit is. Excel's trendline options are explained here as:

  • Linear: A straight line used to show a steady rate of increase or decrease in values.
  • Exponential: This trendline visualizes an increase or decrease in values at an increasingly higher rate. The line is more curved than a linear trendline.
  • Logarithmic: This type is best used when the data increases or decreases quickly, and then levels out.
  • Moving Average: To smooth out the fluctuations in your data and show a trend more clearly, use this type of trendline. It uses a specified number of data points (two is the default), averages them, and then uses this value as a point in the trendline.
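
For reference, the linear, exponential, and logarithmic options above can all be written as the same least-squares solve used in the first snippet, applied to transformed data. This is only a minimal sketch under that assumption (it is not a claim about what any plotting package does internally), and `linfit` is a hypothetical helper name:

```julia
using LinearAlgebra, Random

# Hypothetical helper: least-squares fit of y ≈ a*x + b, returns [a, b]
linfit(x, y) = [x ones(length(x))] \ y

Random.seed!(1)
x = collect(1.0:150.0)
y = rand(150) .+ 0.01 .* x

# Linear: y ≈ a*x + b
a_lin, b_lin = linfit(x, y)
linear_trend = a_lin .* x .+ b_lin

# Exponential: y ≈ c*exp(k*x), fitted as log(y) ≈ k*x + log(c); requires y > 0
k, logc = linfit(x, log.(y))
exp_trend = exp(logc) .* exp.(k .* x)

# Logarithmic: y ≈ a*log(x) + b; requires x > 0
a_log, b_log = linfit(log.(x), y)
log_trend = a_log .* log.(x) .+ b_log
```

Each `*_trend` vector has one value per `x`, so it can be drawn over the data with something like `plot!(x, exp_trend, label = "exponential")`.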

I tried making my own line of best fit, but it’s actually just the average of the points.

using DataFrames, StatsPlots, Statistics
asdf = DataFrame(bb = 10 .* rand(6),
cc = 10 .* rand(6),
dd = 10 .* rand(6),
ee = 10 .* rand(6),
ff = 10 .* rand(6),
gg = 10 .* rand(6))

average_point = [mean(row) for row in eachrow(asdf)]

plot(asdf[!,:bb])
plot!(asdf[!,:cc])
plot!(asdf[!,:dd])
plot!(asdf[!,:ee])
plot!(asdf[!,:ff])
plot!(asdf[!,:gg])
plot!(average_point, colour = :black, lw = 2.0, label = "average", leg = :outerright)

I am curious how other people view this, or any information I could have about the Julia equivalent.

Yeah, this isn't a task for Plots.jl, but it's super easy to use the ecosystem:

There are many Julia plotting packages, some of which support fitting models to data, such as

AlgebraOfGraphics.jl

VegaLite

I would just add that this is a feature, not a bug. Julia is designed so that separate packages compose and play nicely with one another, so it's typical to have small packages with a narrow scope and to load another fit-for-purpose package as needed. In other words, a plotting package isn't really fit for the purpose of building models and should therefore provide limited, if any, model-building capabilities.

As others have pointed out, there are lots of packages for fitting curves to data (I would add GLM.jl), and if they don't already have plot recipes, it's pretty straightforward to generate the data needed to plot via Plots.jl.

In the case of plotting a regression line, I believe Excel lets you get the coefficients as well as the R^2, but I don't think you can get anything else from the plotting feature (that's how it was when I used to use it for this). If you build a regression model with GLM, you'll get a lot more useful information about your model than you would by just plotting the line. For example, do you even want to plot a regression line if there is a high probability that the coefficients aren't significantly different from zero?
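
To make that concrete, here is a rough sketch of pulling the coefficients, their p-values, and R^2 out of a GLM.jl linear model. The data is synthetic and the column names `x` and `y` are just placeholders I picked for illustration:

```julia
using DataFrames, GLM, Random

Random.seed!(1)
df = DataFrame(x = 1:150)
df.y = rand(150) .+ 0.01 .* df.x   # noisy upward trend, as in the original post

model = lm(@formula(y ~ x), df)    # ordinary least squares

# Coefficient table: estimates, standard errors, t statistics, p-values, confidence bounds
println(coeftable(model))
println("R^2 = ", r2(model))

# Coefficients are ordered as in the formula: intercept first, then x
intercept, slope = coef(model)
```

The p-value column of the coefficient table is what tells you whether the slope is distinguishable from zero before you bother drawing the line.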

There actually are a few posts on this if you search the forum. This one could be helpful: Plot the confidence interval for a model fit

In short you can plot the output of predict from the GLM package for a simple linear fit, or of course a logarithmic fit if you parametrize the model appropriately.
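
As a minimal sketch of what I mean (synthetic data, placeholder column names), a logarithmic trendline is just a linear model in `log(x)`, and `predict` gives you the fitted values to plot:

```julia
using DataFrames, GLM, Random

Random.seed!(1)
df = DataFrame(x = 1:150)
df.y = 2 .* log.(df.x) .+ 0.3 .* randn(150)   # rises quickly, then levels out

linear_model = lm(@formula(y ~ x), df)
log_model = lm(@formula(y ~ log(x)), df)      # logarithmic trend: y ≈ a + b*log(x)

# With no new data, predict returns the fitted value at each observed x
df.linear_fit = predict(linear_model)
df.log_fit = predict(log_model)
```

With Plots loaded, something like `scatter(df.x, df.y)` followed by `plot!(df.x, df.log_fit, label = "log trend")` should then draw the trendline over the data.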

Moving averages are available from the RollingFunctions package.
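
If you'd rather not add a dependency, a trailing moving average is also only a line or two by hand. This is a hand-rolled sketch, not the RollingFunctions API, and `moving_average` is a name I made up:

```julia
using Statistics

# Hypothetical helper: trailing moving average over `window` points.
# Returns length(v) - window + 1 values, one per full window.
moving_average(v, window) = [mean(view(v, i-window+1:i)) for i in window:length(v)]

v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ma = moving_average(v, 2)   # [1.5, 2.5, 3.5, 4.5, 5.5]
```

Because the result is shorter than the input, plot it against the matching x positions, e.g. `plot!(x[window:end], ma)`.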

I’m not normally one to present missing functionality as a “feature”, but this pre-packaged application of one-click trendlines has gotten people in trouble before, so there is something to be said for Julia requiring you to explicitly specify some form of model for your trendline:

Hi @HelgavonLichtenstein,
to plot a regression line, with Plots you can use the keyword argument smooth=true.
In your example:

plot(dates, my_values, xticks = (ticks, ticks_labels), label="my series", smooth=true)

Note that AlgebraOfGraphics at the moment has only linear (linear regression) and smooth (LOESS regression, not sure what it corresponds to in Excel terminology), and lacks the other “trendline options”. On the flip side, the design is such that extra analyses can be added easily.

I suspect that instead of hardcoding a few (IMO arbitrary) choices, there should probably be a way for the user to pass a model of the data. AlgebraOfGraphics would then do a least-squares fit and plot the result.
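
To illustrate the idea (this is not AlgebraOfGraphics code, just a sketch of the concept), the user-supplied model could be a collection of basis functions, with the fit being a single least-squares solve. `fit_basis` is a hypothetical name, and the approach only covers models that are linear in their parameters:

```julia
using LinearAlgebra

# Hypothetical helper: least-squares fit of y ≈ sum_j c_j * f_j(x)
# for user-supplied basis functions f_j.
function fit_basis(basis, x, y)
    A = reduce(hcat, [f.(x) for f in basis])   # design matrix, one column per basis function
    coeffs = A \ y                             # least-squares solution
    model(t) = sum(c * f(t) for (c, f) in zip(coeffs, basis))
    return coeffs, model
end

x = collect(1.0:20.0)
y = 3 .* log.(x) .+ 2

# Logarithmic model: basis functions log(x) and the constant 1
coeffs, model = fit_basis((log, one), x, y)
```

Swapping the basis for `(identity, one)` gives a straight line, and `(t -> t^2, identity, one)` gives a quadratic, without touching the fitting code.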

I think there's a plot recipe for this in my package: https://github.com/caseykneale/ChemometricsTools.jl/blob/d8cd288ae76b221274a54cc204cd146791bddf98/src/PlottingTools.jl#L7

I haven't looked at it in ages, though.