Using Python as glue code and through pyjulia is fine. If all of your time is spent in package calls then there’s no speed difference. The issue is developing packages in Python. It’s not just a code acceleration thing, the whole package development ecosystem is kind of wonky with tox environments etc. diffeqpy had the help of some experienced Python package devs and still took about a week to get tests running, so that’s what it’s like. Julia (and R) on the other hand have very easy integrations with CI and code coverage libraries, along with people actually double checking package submissions to meet standards and helps you through fixing issues (no one checks PyPi, anything goes there).
But no reason to not use Python to use other people’s packages since then your code will just be as fast as the package code if all your time is in package code calls. With one big exception: it will run into performance issues with any package that requires function inputs, like differential equation, optimization, etc. libraries. In a recent blog post we showed that Numba+SciPy integrators are still 10x slower than it should be, so it’s not the easy solution some people make it out to be. The reason of course is that Numba doesn’t do interprocedural optimizations and cannot optimize through Python code, so if there’s any Python in the entire stack then the whole thing can get more context switching costs, so even if you have a faster function call you still have a limiting factor. If you have an asymtopically costly function then this could be mitigated, but that means you do have a lot of limitations on standard use cases.
This is why some Python optimization and differential equation libraries actually take in strings or SymPy expressions input (PyDSTool is one of many examples), since direct Numba/Cython usage for function input will have these problems so they have to write a compiler behind the scenes, which then of course means you have the restrictions that it has to be Float64 without arbitrary numbers and etc. etc. etc. You can see how this not only lowers features but also adds to development complexity. In fact, this is clearly seen by how PyTorch has a full JIT compiler inside of it along with a full math library! So developing PyTorch was more like developing Julia and the ML library at the same time, instead of just building an ML library. And then it doesn’t get the automatic composibility features of Flux.jl so you have to build everything in there yourself… you can see how the package development differences are quite astronomical.
Back to Packages
So okay, that rant was done because it’s informative. That’s where are competitive difference is and it’s reflected in the state-of-the-artness of packages. A lot of data science, statistics, and ML libraries just take a matrix of data in and spit out predictions. In that sense, all of the time is spent in package code and while Julia does have a competitive advantage for building such packages, it doesn’t have an advantage for using such packages. This is why data science and ML have done just fine in Python, and statistics has done just fine in R. It would be hard to find real Julia gems in these areas except in things which are still brand new due to someone’s recent research. I’d point to Josh Day’s OnlineStats.jl as an example of that.
One exception is with Bayesian estimation and MCMC. For this area of data science you do need to take in full models. R and Python have resorted to taking Stan models in the form of a string. This is ugly! You’re writing in another language (Stan) and in a string (without syntax highlighting) in order to do Bayesian estimation? Julia has libraries that I have found to be amazing, like Turing.jl and DynamicHMC.jl. These libraries let you insert your likelihood function as just standard Julia code, making it much easier to make complicated models. For example, Stan only lets you use its two differential equation solvers (with like 5 options ) and only for ODEs, but Turing.jl and DynamicHMC.jl both can internally utilize DifferentialEquations.jl code for ODEs, SDEs, DAEs, DDEs, etc.
And neural net libraries are not as blackbox as most of ML. To get around the issues I mentioned earlier, you cannot just pass functions in so you have to tell the neural net library how to build the entire neural net. This is why PyTorch and Tensorflow are essentially languages embedded into Python which produce objects that represent the functions you want to write, which pass the functions to these packages to compile the right code. Obviously this would be hard to write on your own, but the main disadvantage to users is that normal user code doesn’t work inside of these packages (you cannot just call to arbitrary SciPy code for example). But Flux.jl has full composability with Julia code so it’s definitely state-of-the-art and has the flexibility that other neural net libraries are trying to build towards (Tensorflow is rebuilding into Swift in order to get this feature of Flux.jl…)
But where things tend to be wide open is scientific computing. This domain has to deal with functions and abstract types. So I’d point to not just DifferentialEquations.jl and JuMP, but also a lot of the tooling around that area. IterativeSolvers.jl is a great example of something heavily unique. At face value, IterativeSolvers.jl is a lot like Pysparse’s iterative solvers module or the part of SciPy. Except it’s not. These both require a NumPy matrix. They specifically require a dense/sparse matrix in each documented function. However, one of the big uses of these algorithms is not supposed to be on sparse matrices but on matrix-free representations of A*x
, but you cannot do this with these Python packages because they require a real matrix. In IterativeSolvers you just pass a type which has A_mul_B!
(*
) overloaded to be the function call you want and you have a matrix-free iterative solver algorithm. Complex numbers, arbtirary precision, etc. are all supported here which is important for ill-conditioned and quantum uses. Using abstract types you can also presumably throw GPUArrays in there and have it just run. This library only keeps getting better.
One of the best scientific computing packages is BandedMatrices.jl. It lets you directly define banded matrices and then overloads things like *
and \
to use the right parts of BLAS. Compared to just using a sparse matrix (the standard MATLAB/Python way), this is SCREAMING FAST (the QR factorization difference is pretty big). Banded matrices show up everywhere in scientific computing (pretty much every PDE has a banded matrix operator). If it’s not a banded matrix, then you probably have a block-banded matrix, and thank god that @dlfivefifty also wrote BlockBandedMatrices.jl which will utilize the same speed tricks but expanded to block banded matrices (so, discretizations of systems of PDEs).
In fact, I would say that @dlfivefifty is an unsung hero in the backend of Julia’s scientific computing ecosystem. ApproxFun.jl is another example of something truly unique. While MATLAB has chebfun, ApproxFun is similar but creates lazy matrix-free operators (so you can send these linear operators directly to things like, you guessed it, IterativeSolvers.jl). It uses InfiniteMatrix (InfiniteArrays.jl) types to grow and do whatever sized calculation you need without full reconstructions of matrices.
While those may look like niche topics if you don’t know about those, those are the types of libraries which people need in order to build packages which need fast linear algebra, which is essentially anything data science or scientific computing. So Julia’s competitive advantage in package building is compounded by the fact that there’s now great unique fundamental numerical tools!
Along the lines of these “fundamental tools” packages is Distributions.jl. It’s quite a boring package: it’s just a bunch of distributions with overloads, but I think Josh Day’s discussion of how you can write a code which is independent of distributions is truly noteworthy.
If you want to write code to compute quantiles in R or Python, you’d need to use a different function for each possible distribution, or take in a function that computes the pdf (which of course then gives you the problem of function input in these languages!). Package developers thus far have just hard coded the distributions they need into things like SciPy, but it’s clear what the scalability of that is.
And then I’d end with JuMP, DifferentialEquations.jl, and LightGraphs.jl as very user-facing scientific computing packages which are in very important domains but don’t really have a comparison library in another scripting language. JuMP’s advantage is that it allows you to control many things like branch-and-cut algorithms directly from JuMP, something which isn’t possible with “alternatives” like Pyomo, so if you’re deep into a difficult problem it’s really the only choice. DifferentialEquations.jl is one of the few libraries with high order symplectic integrators, one of the few libraries with integrators for stiff delay differential equations, one of the two high order adaptive stochastic differential equation libraries (with the other being a re-implementation of my first paper on it), one of the only libraries with exponential integrators, one of the only libraries with … I can keep going, and of course it does well in first-order ODE benchmarks. So if you’re serious about solving differential equation based models then in many cases it’s hard to even find an appropriate alternative. And LightGraphs is in a truly different level performance wise than what came before, and it only keeps getting better. Using things like MetaGraphs lets you throw metadata in there as well. Its abstract structure lets you write and use graph algorithms independent of the graph implementation, and lets you swap implementations depending on what would be efficient for your application.
So if you’re serious about technical computing, these are three libraries to take seriously.
These packages have allowed for many domain specific tools to be built. Physics is one domain which is doing really well in Julia, with DynamicalSystems.jl and QuantumOptics.jl being real top notch packages (I don’t know of a DynamicalSystems.jl comparison in any other language, and I don’t know anything that does all of the quantum stochasticity etc. of QO).
There’s still more to come of course. Compilation tools like PackageCompiler.jl and targetting asm.js is only in its infancy. JuliaDB is JuliaDB is shaping up to be a more flexible version of dask, and it’s almost there. We’re missing good parallel linear algebra libraries llike a native PETSc, but of course that’s the kind of thing that can never be written in Python (Julia is competing against C++/Fortran/Chapel here).
But I hope this is more of a “lead a to water” kind of post. There are a lot of unique things in Julia, and I hope this explains why certain domains have been able to pull much further ahead than others. Most of the technical computing population is still doing data science on Float64 arrays with blackboxed algorithms (TSne, XGBoost, GLM, etc.) and in that drives a lot of the “immature” talk because what they see in Julia is similar or just a recreation of what exists elsewhere. But in more numerical analysis and scientific computing domains where higher precision matters (“condition number” is common knowledge), functions are input, and there are tons of unique algorithms in the literature that still have no implementation, I would even say that Julia surpassed MATLAB/Python/R awhile ago since it’s hard to find comparable packages to ours in these other languages.
If we want to capture the last flag, we really need to figure out what kind of competitive advantage Julia has in the area of data science / ML package development, because until it’s unique it’ll just be “recreations” which don’t draw the biggest user audience. While “programmers” will like the purity of one language and the ability to develop their own code, I think the completeness of statistical R or ML Python matters more to this group. I think a focus on big data via streaming algorithms + out-of-core is a good strategy, but it’s not something that cannot necessarily be done in C++ (dask), so there’s something more that’s required here.