Future of sparse matrices in base / stdlib

I am not a fan of moving so many things out of base, and even less of a fan of moving many things out of stdlib. With Julia 0.6 and before, having so many things in base made for a very user-friendly experience. Many times, you could easily find what you were looking for by just guessing an appropriate function name and reading its docs. Now, with so much functionality split up into separate packages, you end up having to google for every little function and hope there is some sketchy library that provides it. Frankly, it feels pretty disorganized.

Not to mention that loading stdlib libraries will take forever because they are not part of the sysimg. I get that the purpose is to make things more developer-friendly, but the end result, I feel, is a less user-friendly experience.

I definitely agree that discoverability is a problem right now, but, I would argue, it is a separate one.

Again, this is a separate problem. Hopefully there will be tangible progress on this front before anything gets moved out of stdlib.

But things have already been moved out of stdlib in 0.7. Look at eigs, std, and others.

Just because code is moved out of the JuliaLang repo, it does not become sketchy. An argument can be made for the opposite: moving the code out makes it so much easier to contribute that code quality and feature level are likely to increase. Searching for and discovering standard functions is easily covered by meta packages for each application area.

This is, unfortunately, not a shared opinion. If it were, the whole discussion would be moot.

They are moving basic statistical stuff back into Base. I definitely agree that they should not have taken the LinearAlgebra stuff out either. That’s not the same as sparse matrices, though; you have to draw the line somewhere.

So anything not in the JuliaLang repo is “sketchy”? No offense, but that sounds crazy to me. Open source software would never survive if everybody thought this way.

Can someone explain the concept and process of ‘vetting’? It seems very unclear to me.

As an open source project, anyone can contribute to any part of the language, stdlib, Base, core, and of course packages. Who, exactly, is it that must be vetted for stdlib to be considered acceptable?

For people working in security-focused areas, such as the military or military contractors, code needs to be “vetted” before it can be used in their projects. It’s probably much easier to check a single project than multiple projects that are scattered all over the internet. The more useful code that can be found in a single reliable place, the easier it is for a lot of people.

Sure, but how does it work? Especially in open source, where contributors can be anonymous?

If anonymous has privileges she could use it…

Anyone can contribute, but most high-quality contributions are from people deeply embedded in Julia as an institution: people familiar with the infrastructure, best practices, intended directions, and norms. For all the talk of decentralization and federalization, the same names come up again and again when you check the commit logs of key packages that cross domains. There’s a trust system that develops from this familiarity, and it’s within the system that almost all useful work gets done.

In some cases, that system is explicit: there are primary gatekeepers (https://github.com/orgs/JuliaLang/people) who guard the language & stdlib. The various domain-specific organizations serve a similar purpose, although the boundaries are often less clearly drawn.

When we evaluate packages for suitability, we use membership in those trust organizations (as well as proxies like GitHub stars, documentation, & commit activity) as heuristics for the quality of the underlying code. I think there’s a lot of value in making that trust explicit & providing a set of officially blessed packages. JuliaPro contains that set, more or less, but that blessing isn’t visible when a user’s looking through packages on GitHub. Maybe it should be?

I believe what they do is download a certain version of the code, vet it, and then store it internally. They do that with Python and its main packages like scipy and its many dependencies, so it’s not impossible even when there are lots of packages. But I’m sure it’s harder to get people to vet a large number of small libraries from GitHub than to vet a few big ones.

It’s also annoying from a user experience point of view when I want to do something very simple and suddenly I have to call using Blah and then wait five minutes for the library to load for every little thing.

Another thing I would like to say is that potentially having a new sparse matrix data structure does not mean that the current SparseMatrixCSC has to be removed from base (or especially stdlib). Julia’s extensibility through multiple dispatch means that it should be trivial to have a new sparse matrix data structure that can work alongside SparseMatrixCSC without having to remove SparseMatrixCSC entirely from base or stdlib.
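
To make the multiple dispatch point concrete, here is a rough sketch of a second sparse format living next to SparseMatrixCSC. The names SparseMatrixCOO and matvec are made up for illustration and are not part of any stdlib or package; the type is a plain AbstractMatrix subtype, which is an assumption about how such a type might be introduced, not a statement about any planned design.

```julia
using SparseArrays

# Hypothetical coordinate-format (COO) sparse matrix, for illustration only.
# Entries are stored as (row, col, value) triplets; duplicates are summed.
struct SparseMatrixCOO{Tv,Ti<:Integer} <: AbstractMatrix{Tv}
    m::Int
    n::Int
    rows::Vector{Ti}
    cols::Vector{Ti}
    vals::Vector{Tv}
end

# Minimal AbstractArray interface, so generic code (sum, Matrix, ...) works.
Base.size(A::SparseMatrixCOO) = (A.m, A.n)

function Base.getindex(A::SparseMatrixCOO{Tv}, i::Integer, j::Integer) where {Tv}
    @boundscheck checkbounds(A, i, j)
    s = zero(Tv)
    for k in eachindex(A.vals)
        if A.rows[k] == i && A.cols[k] == j
            s += A.vals[k]
        end
    end
    return s
end

# Hypothetical generic function: the fallback handles any matrix,
# including SparseMatrixCSC, through the existing `*` methods.
matvec(A::AbstractMatrix, x::AbstractVector) = A * x

# A specialized method for the new format; dispatch picks it automatically.
function matvec(A::SparseMatrixCOO, x::AbstractVector)
    length(x) == A.n || throw(DimensionMismatch("length(x) must equal size(A, 2)"))
    y = zeros(promote_type(eltype(A), eltype(x)), A.m)
    for k in eachindex(A.vals)
        y[A.rows[k]] += A.vals[k] * x[A.cols[k]]
    end
    return y
end

rows = [1, 2, 3, 1]
cols = [1, 2, 3, 3]
vals = [2.0, 3.0, 4.0, 1.0]

coo = SparseMatrixCOO(3, 3, rows, cols, vals)  # new format
csc = sparse(rows, cols, vals, 3, 3)           # existing SparseMatrixCSC
x = [1.0, 1.0, 1.0]

y_coo = matvec(coo, x)  # uses the COO method
y_csc = matvec(csc, x)  # uses the fallback
```

Both calls give the same result, and nothing about SparseMatrixCSC had to change or move for the new type to work.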

Not just code, but the authors of the code themselves.
I wish all of this rampant nationalism would just disappear, but unfortunately, that’s the world we live in.
I’d like code to be “vetted” quite thoroughly, preferably by people who are knowledgeable or expert in that particular field, and then have it added to trusted registries, but I couldn’t care less about where the author(s) came from.