Simple linear regression in Julia? (and a comment on the stability of user interfaces)

I have two arrays of the same dimension, and I want to do linear regression
(least squares) between them.

After searching on Google, it seems that Julia used to have a very useful, simple function, linreg, for doing that.

But it seems that it has been removed in 1.1 (or moved to some package?)
How can I replace it?
Please help me!

Just one comment: a thing that I really don’t like about Julia is that everything got deprecated every time. This makes it impossible for new newcomers to pick an example somewhere on the Internet and just try it. It’s almost sure that it won’t work.

For that reason, I really cannot recommend to my students a program that changes its interfaces all the time. Also, in this way, you won’t get useful documentation (like books or courses) from third parties.
I strongly recommend you to keep the user interfaces as stable as possible.

Yes, it looks like linreg got lost in the transition from 0.6 to 0.7. Fortunately, it appears to be very easy to define that function yourself (see the suggestion here: Has `linreg` moved here? · Issue #398 · JuliaStats/StatsBase.jl · GitHub ), so hopefully this won’t impact your work significantly.

For context, out of all of the (dozens and dozens) of things I’ve updated from Julia 0.6 to 0.7 and 1.0, this is the first time I’ve encountered a broken deprecation like this, so the issue isn’t nearly as widespread as you might think from this one example.

I understand that this is frustrating for you. It has taken a lot of work to upgrade almost the entire Julia package ecosystem to Julia 1.0.

But it is hyperbolic and inaccurate to say that “everything got deprecated” and it’s “impossible for new newcomers” and “it’s almost sure that it won’t work” and “you won’t get useful documentation”. That simply isn’t true, and that kind of hyperbole doesn’t make a positive contribution to this discussion. It only serves to frustrate everyone who has already done so much work to upgrade Julia’s packages, documentation, and ecosystem. I’m happy to point you to the wealth of tutorials, examples, books, and courses which are all compatible with Julia 1.0 if you’re interested.

It’s true that a lot has changed on the road to Julia 1.0. That’s to be expected: if the interface had been good enough back in Julia 0.5, then we would have called that 1.0 and been done with it. Nobody knows exactly what a language should look like the first time around, and it has taken several years to refine Julia into something that is good enough to be a stable 1.0.

And, on that note, stability is exactly what Julia reaching 1.0 has meant. We’ve now had versions 1.0.1, 1.0.2, 1.0.3, and 1.1, with 1.2 on the way, all of them with no breaking changes. That means that any example written since Julia 1.0 (i.e. this summer) will be compatible with any current or future 1.x version.

I like the idea of moving statistics functions to packages. I only have one question: if I write a paper using more than 10 Julia packages, do I have to cite all of them? If that is the case, my reference list will look weird.

I apologize if you feel that my comment was not a positive contribution to this discussion, but I wanted to express my sincere feeling of frustration caused by the continuous deprecation of things.
I didn’t mean to disregard in any way the hard work of the people working in Julia.

In my opinion, one possible solution would be to provide some “backward compatibility” package for the functions that got deprecated but can be easily written in terms of others (like this example linreg, or linspace that has been deprecated for range).
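As a rough illustration of what one entry in such a compatibility package could look like, here is a minimal sketch for linspace. The shim below is hypothetical and not taken from any existing package; it only assumes that range with the stop and length keywords is the Julia 1.x replacement.

```julia
# Hypothetical shim: the name `linspace` no longer exists in Julia 1.x,
# so defining it here does not clash with anything in Base.
linspace(start, stop, n::Integer) = range(start, stop=stop, length=n)

r = linspace(0, 1, 5)   # five evenly spaced points from 0 to 1
println(collect(r))
```

Old snippets that call linspace(a, b, n) would then run unchanged after loading such a definition.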

In the particular case of linreg, even though it seems I can define it by

linreg(x, y) = hcat(fill!(similar(x), 1), x) \ y

I see no benefit in simply removing it. You see, for a newcomer student it is not obvious how to define it! Why not just put it into some package like Statistics?

Oh no! You see, I took the line defining linreg from the forum discussion and I got an error! Perhaps it used to work in a previous version of Julia?

linreg(x, y) = hcat(fill!(similar(x), 1), x) \ ylinreg(x, y) = hcat(fill!(similar(x), 1), x) \ y
ERROR: syntax: “hcat(fill!(similar(x), 1), x)” is not a valid function argument name

Yes, that seems completely reasonable. There’s some discussion on that front here: Has `linreg` moved here? · Issue #398 · JuliaStats/StatsBase.jl · GitHub and here: linreg deprecation warning incorrectly points to StatsBase.jl · Issue #28688 · JuliaLang/julia · GitHub

The fact that there is no working deprecation for that function is indeed a bug, and a fixable one. This is likely something that one motivated person (like you!) could make happen. That is ultimately the only way that open-source software moves forward.

You’ve copied and pasted the code twice on one line.
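For reference, here is a minimal sketch of the same one-liner entered once, followed by a small usage example. The sample data is made up for illustration. Note that this version returns a two-element vector [a, b] rather than a tuple, which still destructures the same way.

```julia
# The one-line definition from the thread, entered exactly once.
# It returns [a, b] such that y ≈ a .+ b .* x in the least-squares sense.
linreg(x, y) = hcat(fill!(similar(x), 1), x) \ y

# Made-up sample data lying exactly on the line y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

a, b = linreg(x, y)   # a is the intercept, b is the slope
println("intercept ≈ ", a, ", slope ≈ ", b)
```

The backslash operator solves the overdetermined system in the least-squares sense, so no extra package is needed for this bare-bones fit.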

Many thanks! I see! I will try to report the bug in the Statistics package.

Just to reiterate, the whole point of the 1.0 and beyond release is to stop deprecating things. Stability wasn’t promised for the previous releases.

You would probably be better off using a proper regression package anyway; see Examples · GLM. That way you get standard errors and more flexibility.
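A minimal sketch of what that might look like, assuming the GLM and DataFrames packages are installed. The column names X and Y and the data values are made up for illustration.

```julia
using DataFrames, GLM

# Made-up sample data; X and Y are illustrative column names
data = DataFrame(X = [1, 2, 3], Y = [2, 4, 7])

# Ordinary least squares fit of Y on X (an intercept is included by default)
ols = lm(@formula(Y ~ X), data)

println(coef(ols))       # estimated [intercept, slope]
println(stderror(ols))   # standard errors of those estimates
```

Printing ols itself shows a coefficient table with standard errors, p-values and confidence intervals.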

I also needed a quick and easy bare-bones solution to linear regression in Julia, so I finally put one together as a package (with minimal dependencies) for Julia 1+. You can find it here: https://github.com/st--/LinearRegression.jl. In the README I also list various other packages that provide fancier solutions (e.g. GLM.jl for DataFrames, fitting statistics, etc., and others for ridge regression, sparse regression, Bayesian linear regression and so on), in the hope that it’s useful to others too.
