To anyone considering a reply to this already long thread: please be considerate of your colleagues in the forum and try to focus on the original post.
Let’s use our limited resources to improve the status quo instead of fighting with each other.
Using Julia for standard ML stuff like images and NLP is a stretch if prepackaged algorithms exist for Python frameworks. But using AD packages for scientific machine learning and the million other algorithms that require gradients (e.g., Bayesian estimation and probabilistic programming), with flexibility in creating the computational graph, is where Julia should have a sweet spot. Hopefully significant investment in AD can make that a reality.
Shout out to those doing Zygote maintenance the best they can (@ToucheSir and @mcabbott and @devmotion and many others), and an acknowledgment of the thankless task of maintaining something they didn’t design, with perplexing interactions with the compiler, and grumpy users.
For me, Julia is just yet another glue language. It lets me call different packages, as with Python, while avoiding a terrible experience like C++.
The only reason I moved all my (scientific computing) code from Python is the @threads macro and the @distributed macro.
When talking about autodiff, large and complex packages like Zygote are typically mentioned, together with their drawbacks and bugs. But it is often useful to differentiate a low-dimensional function, and here ForwardDiff shines! The function can be written without much thought about differentiation, can use custom structs, and can call other functions: ForwardDiff works just fine. For me this is the main (and only) autodiff use case, and I have never encountered bugs causing wrong results on this path.
I’ve come to Julia after having previously worked within the SciPy ecosystem, so I’d say there are two forks here:
Some of the stuff that Yuri identified, like the aliasing issues of sum!, is really bad. This is the kind of thing that the SciPy team really hammered out early on, so you never had to worry about the correctness of the output.
I give the Julia developers a bit of a pass on other stuff, like the issues with Zygote.jl, which is really beyond “core” Julia. But automatic differentiation is a generically important tool, and Julia should have it.
This leads to a gripe I’ve had since the beginning, which is that by ceding the direction of the ecosystem to the community, you get buggy and redundant packages. As noted elsewhere, ForwardDiff.jl works for a lot of smaller scale stuff, but is inadequate for the ML stuff that prompted Zygote.jl. Right now, we’ve got people developing for both. But would it be better to have everyone work together, either on extending ForwardDiff.jl to be suitable for ML or on getting Zygote.jl fixed?
I know there are arguments in favor of having this kind of massive, community-driven, flexible ecosystem, but I really think it ends up creating confusion and disappointment with redundant and/or incomplete packages. A good example of this is interpolation; there have got to be at least half a dozen different tools for 1D/2D interpolation. This is the sort of stuff that SciPy resolved, with the main developers slowly adding “official” routines to the main package over time.
Just wanted to chime in on this specific example of package “redundancy”.
If I’m not mistaken, ForwardDiff.jl will never be the best choice for large-scale deep learning because it implements forward-mode autodiff, whereas Zygote.jl (or Enzyme.jl) implements reverse-mode autodiff. These two methods have very different mathematical properties, and that’s why we need both.
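To make the forward- vs. reverse-mode distinction concrete, here is a toy dual-number sketch in Python. It is only an illustration of the idea, not how any of the Julia packages are implemented; `Dual`, `forward_gradient`, and `loss` are names made up for this example. Forward mode pushes one derivative direction through the function per pass, so the gradient of a scalar function of n inputs costs n passes in this naive version. Reverse mode recovers all n partials from one forward pass plus one backward pass, which is why it is preferred when n is in the millions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative along one seeded input direction."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def forward_gradient(f, xs):
    """Gradient of f: R^n -> R by forward mode: one pass of f per input."""
    grad = []
    for i in range(len(xs)):
        # Seed input i with derivative 1 and every other input with 0.
        seeded = [Dual(x, 1.0 if j == i else 0.0) for j, x in enumerate(xs)]
        grad.append(f(seeded).der)
    return grad


def loss(w):
    # Hypothetical scalar function, written with no thought given to AD.
    return w[0] * w[0] + 3 * w[0] * w[1] + w[2] * w[2]


grad = forward_gradient(loss, [1.0, 2.0, 3.0])  # three passes for three inputs
print(grad)
```

For three inputs the extra passes are irrelevant, which matches the experience that forward mode is very pleasant for low-dimensional functions; the cost only becomes a problem as the number of inputs grows.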
However, I see your more general point, and it is true that in areas like AD, Julia doesn’t yet have a “one package to rule them all” approach. But that’s consistent with the language’s philosophy of dispatching on the tool that works best. For instance, if you are working on small neural networks, SimpleChains.jl seems to give impressive speedups compared to the default choice of Flux.jl.
Sure, it gives a little more work to the package user who has to look around and choose, but I’d argue that:
- the gain in performance can be worth the cost
- interface packages such as ChainRulesCore.jl or AbstractDifferentiation.jl make it increasingly easy to switch between packages without altering your code (shoutout to their devs!)
Yeah, that’s a good point about reverse- vs. forward-mode differentiation; of course there’s ReverseDiff.jl too.
I’d add that if the lead developers hope that Julia ends up with an R-style ecosystem, that’s bad. I know R is successful (and popular) for what it does, but its package ecosystem is such a cacophony of interfaces and data structures (e.g., arrays vs. matrices vs. data frames) that even simple tasks that involve mixing packages developed by two different groups become projects in themselves.
Regarding your two bulleted points, I would remark that I’d take a modest performance hit (say 10%) if it meant my development time was cut because there was less ambiguity about which package to use for what, better interoperability, and logically consistent interfaces/data structures. The interface packages and abstractions are great, if they work.
This reminds me a bit of the Trilinos vs. PETSc competition for high performance numerical linear algebra. Trilinos was probably a bit faster, at least at some tasks, but it was a collection of independently and often inconsistently developed packages that required auxiliary packages for clean interoperability. In contrast, PETSc had a much more unified development plan, so things tended to just work, in a consistent way, without too much effort on the user end.
These large packages/ecosystems like SciPy take time to develop, but I think we are already starting to see them emerge in Julia, despite its young age. DifferentialEquations.jl, Queryverse and MLJ spring to mind. I think in general, even without formal interfaces, Julia presents less friction for these ecosystems to develop than Python, and we will see many more develop and mature in the near future.
I think you’re right, but now I’m curious about the plan for formal interfaces going forward. It seems like a hard or at least a new problem, given Julia’s unreasonably effective multiple dispatch. Is there any work on this or has it stalled?
Those are good books but I don’t think the first-order issue is ignorance. The key here is your use of the word “professional” i.e., someone paying you to do something for a living. Many of the maintainers are doing this in support of their research or even just to contribute to the community. We should be grateful for the help they give, even if those contributing less on maintenance may sometimes try to respectfully nudge things towards coordination of limited resources on essential packages.
I suppose you’d have to say the same thing about Python, since the core Python team also leaves the development of scientific computing libraries to the community. Furthermore, because of how Python is designed, different package ecosystems all end up rolling their own arrays from scratch (NumPy, Theano, TensorFlow, PyTorch, JAX, etc.). So Python also has quite a bit of fragmentation. One big difference is that the Python community is so enormous that there are plenty of users and developers to go around.
Indeed, Python suffered, and continues to suffer, from many of the same issues. But,
Python started as a general purpose language, while Julia, at least as I understood it, was always targeted at scientific computing and related disciplines, with the intention of displacing MATLAB along with the SciPy suite.
I agree that this all may clear up in time, but as you note, Python now has NumPy/SciPy. So all that core functionality that comes to mind when you think of MATLAB (arrays, integration, root finding, interpolation, etc.) is located inside of SciPy. More challenging things, like ML, which lack some of the unified theory of older numerical algorithms, are still being worked out. My gripe is that Julia doesn’t yet have a full counterpart to SciPy: if not an official set of packages, then at least a well-recognized suite that tackles all of the classical MATLAB stuff out of, or nearly out of, the box, with common function call styles and consistent use of data structures.
Yes, but that’s fundamentally the problem. Going off the TIOBE index, Python is about 21 times more popular than Julia, but we split and create packages like we have a community the size of Python’s.
There are 3 popular ways of doing automatic differentiation for machine learning in Python – TensorFlow, PyTorch, and JAX. (Theano has been abandoned.) Of course, most of the Julia community are academics or in ML. So let’s say Julia users are 7 or so times more likely to work on autodiff-related projects. So if Julia were equally fragmented, we’d expect to have 3*7/21 = 1 major framework for automatic differentiation. Instead, we have at least 10 that I found without much work:
Enzyme
ReverseDiff
Diffractor
Nabla
Zygote
Autograd
Yota
ReversePropagation
ForwardDiff
Tracker
In other words, we’re spending about 10% of the resources per package that Python is. This pretty much replicates across domains; I figure we have about 4 or so PPLs in Julia (Gen, Turing, Soss, ForneyLab) with a much smaller community, while Python has 2 big ones (PyMC3 and Pyro).
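For anyone who wants to check the back-of-the-envelope estimate for the autodiff packages, here it is spelled out in Python. Every input is a rough figure taken from this post (the factor of 7 in particular is a guess), and the variable names are made up for illustration, so treat the result as an order-of-magnitude argument at best.

```python
# Rough inputs, all taken from the post itself.
python_frameworks = 3       # TensorFlow, PyTorch, JAX
popularity_ratio = 21       # Python vs. Julia popularity, as quoted from TIOBE
ad_interest_factor = 7      # guess: how much likelier Julia users are to work on AD
julia_packages_found = 10   # AD packages listed above

# Frameworks Julia "should" have if it were as fragmented as Python.
expected_julia_frameworks = (
    python_frameworks * ad_interest_factor / popularity_ratio
)

# Share of resources each Julia package gets relative to a Python framework.
relative_resources_per_package = expected_julia_frameworks / julia_packages_found

print(expected_julia_frameworks)       # 1.0
print(relative_resources_per_package)  # 0.1
```

Changing the guessed factor of 7 moves the result proportionally, so the conclusion is only as good as that guess.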
So it seems these packages have to be spending less time and effort on bug-hunting and fixes, like the post argues. Where I disagree is I don’t think this is because Julia coders don’t care whether their code has bugs; everyone cares about that. I think it’s the fact that no single package has the same level of resources being dedicated to it as JAX or PyTorch. It’s the Lisp curse making us go from “Hey, I could build something like that!” to “Hey, I should build something like that!” It’s an academic mindset that says a package is something one person builds by themselves, and then they’re done. And it’s an extreme level of perfectionism in vision where any design choice we disagree with has to be remedied with a new package, instead of prioritizing improvements for whatever already exists and is good enough for most users right now.
Your claims here might be an accurate diagnosis of problems in the community, but I worry you’re conditioning in a distorting way by comparing “popular” packages in the two languages – are there really fewer libraries in Python or is the threshold for popularity in Python high enough that it would kill off most of the Julia options you’re comparing against? Did you actually exhaust the comparable Python options?
Just to give a sense of relative popularity, here is a comparison by GitHub stars:
I totally disagree with the view that the flourishing of similar packages is bad. Julia is relatively young. We will eventually benefit from all these experimental efforts. Great packages emerge during the process. I still remember how Flask struck me when it appeared, even though there were already several major frameworks in the Python community.