I’m still trying to figure out how to articulate the “1.5” language problem. Here’s my attempt.
- There is a gap between intuitive code that looks like the algorithm and performant code that will run efficiently.
- Work still needs to be done to go from code that works to code that is deployable.
The beginning of the talk addresses a situation where a collaborator was using MATLAB and there was disappointment that a direct port did not quickly result in fast(er) code.
In fact, it probably results in slower code. While the syntax looks similar, there are some important differences in semantics. MATLAB uses copy-on-write semantics, whereas indexing a Julia Array with a slice makes an eager copy by default. MATLAB also strongly encourages you to go down heavily optimized official code paths, whereas Julia actively encourages use of third-party code with
The notion that a quick port would automatically be fast seems naive to me, although I can see how Julia catchphrases could give that impression. That naivety assumes that MathWorks has been asleep at the wheel and could significantly optimize the execution of MATLAB code. While MathWorks has at times seemed complacent, or may be pursuing priorities distinct from mine, I think they are also trying pretty hard to make MATLAB as fast as possible, perhaps in response to Julia. For me, the miracle is that a quick port to Julia is actually possible and works at all. That is quite rare.
My frustration with the MATLAB and Python approach is that eventually I hit a hard wall in terms of performance in spite of deploying a bag of tricks. While it can be fun “vectorizing” code or finding optimized code paths, this has its limits. The solution then is to build a MEX or C extension (or use Cython, Numba, JAX, etc.) that creates a new “fast” path, and then use that.
What I appreciate about Julia is actually that the 1.5 language option exists. What I also appreciate about Julia is that the 1.1, 1.2, 1.3, … 1.99 language options also exist. It is possible to iterate on Julia code gradually, and I can choose where along that path I want to stop. I can throw in a few @view statements, and now my code allocates less and runs faster. I could also go as far as writing inline LLVM IR. Importantly, there is a middle ground where I can use someone else’s LLVM IR wrapped into an API, as is the case with SIMD.jl or LoopVectorization.jl.
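As a rough sketch of what that gradual path can look like, here are three versions of the same computation. The task (summing squared differences between adjacent columns) and the function names are made up for illustration, and I use `@views`, the block form of `@view`, plus the `@inbounds` and `@simd` macros from Base rather than any third-party package.

```julia
# Hypothetical example: sum of squared differences between adjacent columns.

# "1.0": a direct, MATLAB-style port. Each slice A[:, j] makes a copy,
# and the subtraction allocates one more temporary array.
function coldiff_naive(A::Matrix{Float64})
    total = 0.0
    for j in 1:size(A, 2) - 1
        total += sum(abs2, A[:, j + 1] - A[:, j])
    end
    return total
end

# "1.1": same code, but @views turns the slices into non-copying views.
# The broadcasted subtraction still allocates one temporary per iteration.
function coldiff_views(A::Matrix{Float64})
    total = 0.0
    @views for j in 1:size(A, 2) - 1
        total += sum(abs2, A[:, j + 1] .- A[:, j])
    end
    return total
end

# "1.5": an explicit inner loop with no temporaries at all.
# @inbounds is safe here because i and j stay within size(A);
# @simd allows the compiler to reorder the floating-point sum.
function coldiff_loop(A::Matrix{Float64})
    total = 0.0
    for j in 1:size(A, 2) - 1
        @inbounds @simd for i in 1:size(A, 1)
            d = A[i, j + 1] - A[i, j]
            total += d * d
        end
    end
    return total
end
```

All three should agree up to floating-point rounding, so I can stop at whichever version is fast enough. Comparing them with `@time` or `@allocated` (after a warm-up call) is one way to see where the allocations go.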
I do program in a few other languages, particularly Python and Java these days. The tragedy I see there is that open source projects can often get stuck because some of the most dedicated contributors lack the skills to fix the parts involving a second language. My most recent example of this is zarr-developers/numcodecs, which is a wrapper for compression codecs. There, the underlying compressors went without an update for three years.
The deployment problem is vexing, but this reflects that Julia is now in a very different place than where it started. The largest artifacts of this are the large “standard” library and the rather large Base module. Julia started as a new solution for scientific or technical computing, so it made complete sense to have linear algebra available and loaded as quickly as possible. Now we are pushing Julia into general purpose computing and even embedded applications that do not need matrix multiplication. Progress is being made in this area by turning “standard” packages into discrete but still “default” packages. In doing so, there will now be the option to exclude these packages from a deployment.
A Base replacement may eventually be needed. As Jeff explains, there is an awful lot there that may not be needed in every application. A “static” micro-Base may be needed to solve this in the future. This is not a fundamental problem, however. Prototypes such as StaticTools.jl or WebAssemblyCompiler.jl show that there are a number of potential solutions available.