Future of sparse matrices in base / stdlib


#1

Having read through https://github.com/JuliaLang/julia/pull/27638, I am concerned that there is a push to move sparse matrices (SparseMatrixCSC) into a third-party library.

Is this the plan?


#2

Yes, and it will be glorious! :tada:


#3

Is that the official position? (This is important, as opinions differ as to its glory.)


#4

Why is it that moving sparse matrices to a third-party package has such a big impact on your organization?


#6

I’m all for moving things that incur large external dependencies out of Base and into stdlib (and sometimes, as in the case of SuiteSparse, which is GPLed code, out of the distribution entirely). However,
I think some of the simple sparse matrix support (such as SparseMatrixCSC) should be left in stdlib, and not just because removing it creates major headaches for people such as Seth, whom I consider an important person in the Julia community.
In fact, I think some of the things in DataStructures.jl should be cleaned up and moved into stdlib.

Many companies have to be careful about exactly which packages can be used (both the code and the authors have to be vetted - possibly “extremely” :grinning:), but they are more willing to trust things that are considered part of the language or its standard library.
Hopefully, in the future, it will be possible to simply choose at startup what you want out of stdlib (while still loading as fast as the monolithic way everything is loaded now), and then people might not push things totally out of stdlib.
At Dynactionize, our servers had no direct connection to the internet. We had our own local copy of all of the packages we used (and Julia itself, compiled with no GPL libs), and a branch of METADATA.jl that only had the information about the packages that we used and had tested against.
Any security-conscious company, like a healthcare system, would need to do something like that.

3 years ago I made my own branch of Julia, where I could use flags in Make.user to enable different features, such as BigInts, BigFloats, the REPL, the doc system, LinAlg, and more.
Running a script on a version of Julia without all of that extra overhead, especially on my Raspberry Pi, was quite a bit faster.

Most of those (with the exception of BigInts, BigFloats, and Regex support) have already been moved out to stdlib, so it’s really just a matter now of making things in stdlib optional.


#7

Is your reply in reference to some discussion elsewhere? (I am interested in this issue as well, as almost all I do depends on sparse arrays.)

Thanks for keeping track of what is going on with this. I missed the discussion trail about SuiteSparse linked to above…


#9

Is it relevant that they’re at Julia Computing, or is it relevant that they’re core contributors?

It would be quite unfortunate if people saw this discussion and thought (I mean, who knows, given the comment it might be the case) that it’s possible to pay Julia Computing to significantly alter the direction of the project.

I know, there may be several reasons why you might not be able to share your position publicly, but it’s hard to persuade the crowds with talks behind closed doors.


#10

With respect, this is not my goal. My objective is to get an answer to my original question; that is:


#11

Right, I do see that you were not the one who brought up “why”, only “if”.


#12

It’s relevant because we’re core contributors and he wanted to make the full reasons for his concerns known to us. Seth doesn’t pay Julia Computing directly or indirectly.


#13

To expand on this: the reason I went to the folks I did was because they are core contributors with projects in similar environments and I wanted to get their opinions as to whether it was possible to work around the bureaucratic issues that accompany such environments.

As Stefan said, I have no direct or indirect business or other affiliation with Julia Computing.


#14

It looks unlikely that this will happen for 1.0. What should happen for 2.0 or beyond is a decision to revisit at that point.

The reason is that if we move sparse direct solvers into their own package outside stdlib (so that people who want other solvers can use those), some sparse linear algebra will fall back to AbstractArray implementations. There has also been some discussion around having more flexibility with sparse matrix implementations and having them in Base/stdlib makes it difficult for things to evolve.


#15

If it’s not in Base for 1.0 it can change in stdlib in a 1.x version. Base is the only thing that’s fully locked down by the 1.0 version. The stdlibs can be versioned at 2.0 before Julia itself gets to 2.0.


#16

As a user of sparse matrices I would not see any problem with moving them out of the stdlib into their own package. I suspect that might make them somewhat easier to contribute to. I think StaticArrays is a good precedent: it is probably at least as widely used in Julia but has never been in stdlib.


#17

I used to think that it was necessary for usability for more things to be in Base/stdlib. My own views have evolved, especially with BinaryBuilder and Pkg3 having made rapid strides. I do think that these things would actually see more movement if they were external packages.

In any case, I think this would be a major change too late in the cycle, and perhaps we needed a more consultative process, and more time to see how such changes feel with time to undo them if necessary. For now, we are leaving everything as is for 1.0.

Let’s revisit it going forward.

-viral


#18

Note that stdlibs can become normal packages in the future in 1.x versions of Julia. This is possible because project and manifest files explicitly include stdlibs just like normal packages, so if you instantiate a manifest with something that used to be a stdlib dependency but has become a normal package, it will still work. In other words, there is no urgency about moving this out of the stdlibs right now; it can be done at a future time when we’ve got better answers to how to keep it from being disruptive.
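
To illustrate the point about project files, here is a minimal sketch of the `[deps]` section of a `Project.toml` for a hypothetical project that uses sparse matrices. The project itself is an assumption made up for illustration; the point is only that a stdlib such as SparseArrays is recorded by name and UUID in the same way a registered package would be.

```toml
# Project.toml for a hypothetical project (illustration only).
# Standard libraries are declared by name and UUID, exactly like
# ordinary packages, so the entry keeps resolving if the library
# later stops being a stdlib and becomes a normal package.
[deps]
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
```

Because the dependency is explicit, code that does `using SparseArrays` should not need to change if the library moves.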


#19

Yes, please!


#20

Backward compatibility and guaranteed stability of code mean less movement. What is better depends on what you do.


#21

> Backward compatibility and guaranteed stability of code mean less movement. What is better depends on what you do.

Yes, API stability will bring less movement because of breakage. I should have been clear that the movement I was referring to was new features - more sparse datatypes, new kinds of solvers, etc.

-viral


#22

I know that, but the Julia Computing / JuliaLang confusion is a real thing, so I explicitly asked. I just came from a Computation in Economics and Finance conference last week, and I heard a lot of critical remarks about the fact that there is no “Julia Foundation”, but there is a company that employs the very core devs. I have always been supportive and wished JC the best of luck, but what I’m describing are actual remarks from actual conversations. So saying “I’ve discussed this important thing with JC in private, and I don’t want to discuss it” might not ease the worries of those people.

With that said, I originally did not intend to derail this conversation even though I did, so please start a new thread or PM me if you think it’s really important that I hear your opinion :slight_smile: