Discussion on "Why I no longer recommend Julia" by Yuri Vishnevsky

Julia’s high productivity may have led to this. Because packages are sometimes not easily discoverable or well-documented, for many tasks discovering and learning someone else’s package may take longer than writing one from scratch. We explore the space of possibilities quite well because of this. The flipside is that we get multiple similar but incompatible efforts, and individual packages sometimes don’t get battle-tested enough.

I feel as though we have reached a stage where some consolidation is happening both through unified interfaces and filling feature gaps.

Many core components of the Julia ecosystem for data science or statistics are already top-notch and sometimes even unmatched.

For AD (automatic differentiation), my ride has been bumpier than I had hoped, and due to my lack of familiarity with the domain I don’t see a clear path forward. In my experience we are quite far from being able to just differentiate most things we would expect to be able to. Many times fairly simple computations don’t AD and it’s incredibly time-consuming to track down why they don’t, and even when they work, any time you update your dependencies you could be breaking AD for your application. Some of that is probably due to the inherent difficulty of what the AD packages are trying to accomplish, but many times I feel that it’s just that the code path has not been sampled even though it’s not far-fetched.

6 Likes

This is spot on. If we want correctness and conformity in the ecosystem, we should achieve it by making it easy to write packages that are correct and compatible as the path of least resistance, not by controlling what packages people write or work on. That means developer tooling, interface tests, and base packages that unify functionality in a consistent way while easing developer workload.

We need more carrots, not more sticks.
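
To make "interface tests" concrete, here is a minimal sketch, written in Python for brevity rather than Julia. Everything in it is hypothetical: the function `check_sequence_interface`, the classes `GoodVector` and `AliasingVector`, and the three checks chosen are made up for illustration and are not the API of any existing package. The idea is that a shared checker can be run by any package that claims to implement an interface, so conformance becomes a one-line test rather than a convention.

```python
from typing import Any, Callable, List


def check_sequence_interface(make: Callable[[List[float]], Any]) -> List[str]:
    """Return a list of violated expectations (empty means conformant).

    `make` is a hypothetical factory that builds the type under test
    from a plain list of floats.
    """
    data = [3.0, 1.0, 2.0]
    obj = make(data)
    failures = []

    # Length must match the input.
    if len(obj) != len(data):
        failures.append("len() disagrees with the input length")

    # Indexing must be consistent with iteration order.
    if [obj[i] for i in range(len(data))] != list(obj):
        failures.append("indexing and iteration disagree")

    # Mutating the caller's list afterwards must not change the container.
    data[0] = -1.0
    if obj[0] == -1.0:
        failures.append("container aliases the caller's list")

    return failures


class GoodVector:
    """Hypothetical well-behaved implementation."""

    def __init__(self, values):
        self._values = list(values)  # defensive copy

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        return self._values[i]

    def __iter__(self):
        return iter(self._values)


class AliasingVector(GoodVector):
    """Hypothetical implementation with a subtle aliasing bug."""

    def __init__(self, values):
        self._values = values  # keeps a reference to the caller's list
```

With this sketch, `check_sequence_interface(GoodVector)` returns an empty list, while `AliasingVector` is flagged for keeping a reference to the caller's list.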

27 Likes

One thing that I think gets lost in all this discussion when mentioning Python libraries like TensorFlow, PyTorch, and JAX is that they are, as far as I am aware, essentially being developed by dedicated teams at some of the largest software companies in the world. Their developers are paid to work together on a unified product. That’s their job.

There is no equivalent in the Julia world, i.e. companies at the scale of Meta/Google who can afford to pay large teams of developers to work full time on open source libraries. Many (most?) of Julia’s developers are not employed to permanently work on a few specific Julia libraries, but are instead working on what interests them, or what they need to facilitate a project they are working on as part of their job. Asking them to reallocate their time away from what they find fun to do as a hobby, or what they need for their own job, is a hard sell.

20 Likes

Between Relational AI, Pumas, Beacon Biosignals, Julia Lab, JuliaComputing (and others) we are starting to get a group of developers paid to use Julia full time. We’re nowhere near Google/Microsoft scale, but we are starting to get more funded dev time.

29 Likes

PyPI has 49x more, with twenty times the userbase and three times the age. “Python is older” doesn’t explain it, but “Python is older and much bigger” does.

1 Like

I broadly agree, although I suspect that the dev base may be 20x but the user base is more like 100x.

Well, but it is so much more fun to write Julia packages than Python packages, so this ratio is changing…

20 Likes

As it turns out, Julia is a Lisp after all, and so it suffers from its curse. That curse is also a blessing, because I don’t know of another language where one can be so expressive and also write performant code at the same time.
In Python there is a massive barrier between users and developers, while in Julia it seems that most users are an idea away from being developers.

24 Likes

14 posts were split to a new topic: Towards the creation of an interface testing package

I don’t think Julia is more dangerous than early-stage JavaScript…
JavaScript has its advantages, and people who wanted better software called for TypeScript.
Julia has its advantages too; we might get something similar.

2 Likes

25 posts were split to a new topic: How do I know if a package is good?

A post was merged into an existing topic: How do I know if a package is good?

So is the picture that is emerging that, in Julia, the SciML ecosystem will be the “equivalent” of Python’s SciPy? (i.e. the go-to package for numerical computing, which covers most common problems)

If so, I see two fields which are still missing: eigenproblems and FFTs. Are there plans to add those to the SciML ecosystem, or are those beyond scope?

(Perhaps this is off topic. If so, please feel free to move this to another thread.)

2 Likes

Yes, here’s the unified documentation (which is still in-progress, so don’t post it as something that’s complete, we’ll do a release announcement when it is):

https://docs.sciml.ai/dev/

Indeed you can see it’s meant to be the “go-to” for numerical computing. Indeed FFTs are missing, but that’s because they already existed and are good in Julia. I mention them in the docs:

https://docs.sciml.ai/dev/highlevels/interfaces/#AbstractFFTs.jl:-High-Level-Shared-Interface-for-Fast-Fourier-Transformation-Libraries

https://docs.sciml.ai/dev/highlevels/numerical_utilities/#FFTW.jl:-Fastest-Fourier-Transformation-in-the-West

But I wonder if there’s a better way to highlight them. Eigenvalue problems, yes that should get something similar to LinearSolve.jl but we just haven’t gotten around to it.

The other thing is interpolations.

24 Likes

I added FFTW just directly to the docs since it’s just the repo. That should make it more searchable. DataInterpolations too (not Interpolations.jl because of course that is rough to use). Eigenvalue problems will take a bit more to do properly.

8 Likes

I don’t have much to add to this excellent discussion. I just want to say that it is great that Yuri and others opened all those issues listed in the blog post. Issues are actionable, and at least give developers the opportunity to address them, even if it occasionally takes a long time because of underlying difficulties or time constraints.

30 Likes

I think that having a “one-stop-shop” for most common tasks in numerical computing is really important in order to advance the adoption of Julia. Having a unified SciML ecosystem will greatly improve this! So thank you for your efforts @ChrisRackauckas !

I have been facing a problem when trying to convince students to use Julia.

First of all, I think Julia is a fantastic language! I love the simple syntax (much simpler than Python’s, which becomes a mess as soon as you start using arrays). Types + multiple dispatch just immediately made sense to me (while Python objects are hermetic, arcane nonsense to me).

But all of the students I interact with already know Python+numpy+scipy (and many also know about numba).

How can I convince them to use Julia?

Well, for-loops are horribly slow in Python, while in Julia they are as fast as they can be. So that must be it! Except that there is numba, which solves that problem. (Actually, it was the limitations of numba 6 years ago that made me move to Julia, but it seems that numba has greatly improved since then.)

What about multithreading? Numba now also supports multithreading.

GPU programming? Again Numba + CuPy.

Now all of these things can be achieved in Julia. But at present, there is too much friction.

In my area of research the most common tasks are: 1) solving eigenproblems, 2) solving linear problems, 3) quadrature, 4) sometimes an FFT, or 5) an interpolation. If a student has to tackle those in Python, the student just has to write:

import numpy, scipy, numba, matplotlib

and that’s it.

In Julia? The student has to import one package for each task and I have to educate them on the differences between packages.

Should the student use ARPACK.jl, ArnoldiMethod.jl or KrylovKit.jl? What if we want inner eigenvalues? Well, ARPACK.jl has shift-and-invert, but ARPACK.jl is also very fragile and prone to throwing cryptic error messages. So ArnoldiMethod.jl or KrylovKit.jl it is. Except those have no shift-and-invert implemented. So the student has to roll their own. So what was a very simple task (calculating a few eigenvalues) becomes a lesson in numerical linear algebra. While I am going through this, the student has probably lost interest in this wonderful language and already solved the problem in Python.
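
For readers who have not met it, "rolling your own" shift-and-invert roughly means running power iteration on (A − σI)⁻¹ instead of on A, so that the eigenvalue of A closest to the shift σ dominates. Below is a minimal sketch in pure Python for a small dense symmetric matrix. The helper names `solve` and `eig_near` are made up for illustration, and a real implementation would factorize the shifted matrix once and hand the solve to a Krylov method rather than redoing Gaussian elimination every iteration. (In SciPy, I believe this is what passing `sigma` to `scipy.sparse.linalg.eigsh` gives you out of the box.)

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def eig_near(a: Matrix, sigma: float, iters: int = 50) -> Tuple[float, Vector]:
    """Approximate the eigenpair of symmetric `a` closest to `sigma`.

    Assumes `sigma` is not exactly an eigenvalue (the shifted matrix
    would be singular) and that the start vector is not orthogonal
    to the wanted eigenvector.
    """
    n = len(a)
    shifted = [[a[i][j] - (sigma if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [float(i + 1) for i in range(n)]  # arbitrary start vector
    for _ in range(iters):
        y = solve(shifted, x)  # apply (A - sigma I)^{-1}
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # Rayleigh quotient x^T A x for the unit vector x.
    ax = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    lam = sum(xi * axi for xi, axi in zip(x, ax))
    return lam, x
```

For the 3×3 matrix with 2 on the diagonal and −1 on the off-diagonals, whose eigenvalues are 2 − √2, 2 and 2 + √2, calling `eig_near(a, 1.9)` converges to the inner eigenvalue 2, which plain power iteration would never find.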

Plotting? Should the student use Plots.jl, PyPlot.jl, Makie.jl, VegaLite.jl, Gadfly.jl? Again, in Python, the student already knows there is matplotlib.

All of this is to say that I really think having a default, unified package for most numerical tasks will greatly improve this aspect, and I believe it will make convincing students to adopt Julia much easier.

(sorry if this was a bit ranty)

32 Likes

I agree 100%. I wrote something similar in this post, and we discussed the package ecosystem a bit over in that thread.

To summarize: I think the solution is that we, as instructors using Julia, select a few packages for the purpose of teaching undergraduates and standardize on those for education.

In my opinion, there is an easy part and a hard part to this. The easy part is putting up a “Julia for Education” website somewhere listing the package selections (and other tips for students). (My hope is that we could have a unified source that replaces the “beginning of class Julia quickstart” handouts we all write individually, which needlessly duplicates effort.) I would be happy to do this if there is sufficient interest.

The hard part is making sure those packages are easy to use and bug free. So, for example, it seems someone would have to write shift-and-invert for one of the two packages you mentioned.

However, I think completing the easy part would help focus attention on what needs to be done for the hard part.

If you would like to discuss further, please let me know and I can create a new thread.

7 Likes

I agree, I think some standardisation in the teaching tools is very important. But I also think that it is important to suggest packages that are not just good enough for education, but can also be used for research (just like numpy+scipy). Otherwise, we end up with a kind of “two packages problem”.

That is why I am really hopeful regarding the SciML ecosystem as it will solve the hard problem.

That is what I made for personal usage, with a “common interface” for different eigensolvers. Probably not good enough for public usage though, so I never bothered to put it up on GitHub. But apparently SciML will include something like that, and with better code quality than I would be capable of.

This discussion is a tangent to the original topic, so a new thread is probably a good idea.

2 Likes

Given this, I think the answer is probably “you don’t.” You either tell them to use Julia, in which case you should probably think really hard about your learning objectives when it comes to the code parts of your lessons, or you should stick to teaching the domain-specific stuff, and let them figure out the code stuff on their own.

When I say “think hard about the learning objectives,” I mean that you want to state explicitly (to yourself if nowhere else) what it is you want students to learn, then make sure you can assess that, and make sure you’re modeling it and giving plenty of practice and feedback.

For the most part, IMO, trying to convince students of things is a lost cause. You can share your excitement and hope some of it rubs off, but if you really want students to learn a thing, you need to explicitly teach it.

30 Likes