This month in Julia world - 2023-09

A monthly newsletter, mostly on Julia internals, digestible for casual observers. A biased, incomplete, editorialized list of what I found interesting this month, with contributions from the community.

“Internals” Fora (Slack/Zulip/Discourse/Github):

  • Prem Chintalapudi’s master’s thesis on “Reducing Compilation Latency in the Julia Programming Language” – Prem has done a ton of work on documenting and speeding up the compiler (discussed in previous issues of this newsletter). Check section 5.4 for what remains to be done (and where you can potentially help).
  • There has been a lot of work on making the Julia compiler more pleasant to use from “user space” for people that want to customize compilation. Escape analysis in particular was recently discussed on the internals slack and is now much better documented for 1.11.
  • A recent discussion on what is so special about Julia that static compilation is not yet a solved problem. This type of thread reappears every few months, but there is always something new to discuss as more progress is made. This particular discussion focuses on runtime features that are difficult to provide in a self-contained, statically compiled binary (such a binary should be able to ship without its own copy of an entire Julia installation).
  • Mojo, a new tool aiming to be Python but compiled and fast (similar to Cython, Numba, JAX and others, but more ambitious and more universal) had its first public release. There were a couple of interesting comparison threads with Julia in the context of recursion and another one focused on SIMD-ification. The rough consensus seems to be that “very optimized, not-readable code” performs blazingly fast both in Mojo and in Julia (some users had Mojo a bit faster, some had Julia a bit faster, depending on their hardware). That type of code also does not look anything like actual Python in either case. My personal conclusion was that super-optimized code is equally difficult to write in Mojo and in Julia, but there were a couple of examples of simple Julia code that performs almost as fast as the super-optimized unreadable versions (SIMD-ifying complex numbers being a good demo). On the other hand, in Julia you really should write iterative code, not recursive code, while Mojo does not necessarily require that style. Of course, all of this is about microbenchmarks – how easy it is to build a big project is a different question. Lastly, do check out Chris Elrod’s compile-time trick, making the Julia solution for the recursive microbenchmark take practically constant time (a few nanoseconds) – not really a fair comparison, but a fun trick nonetheless.
  • On autodiff: a very long Slack thread that started with a discussion on creating a universal set of unit tests and regression tests for all Julia autodiff tools, then turned into a discussion of how much of SciML has “secretly” transitioned to using Enzyme behind the scenes already, and then into a discussion of where Enzyme is heading in the near term and how it is being designed for robustness.
  • Another thread on recent exciting improvements to the Enzyme autodiff tool.
  • And a really fun example of how Enzyme works for an incredibly surprising mutating calculation.
  • The Julia compiler recompiles functions/methods as necessary whenever new method definitions are imported. It uses “world ages” as a way to track whether a recompilation is necessary. In that context, yet another piece of dark magic from Frames: a tool that lets you access older versions of functions, from before they were recompiled. It is a bit like the opposite of Revise: it lets you check what a function was doing before you modified it.
  • For when you know you have to do a computation with a ton of allocations in the inner loop, consider Mason’s Bumper.jl, which provides a second, user-manipulable, very fast allocator (a simple bump allocator for manual memory management without garbage collection). Hopefully Mason will not be annoyed that this is being advertised while it is still stealth/experimental.
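
On the iterative-vs-recursive point above: the recursive microbenchmark in those threads is naive Fibonacci. A minimal sketch of the two styles (function names are mine, not from the benchmark threads):

```julia
# Naive recursion: the style benchmarked in the Mojo comparisons, but slow
# in Julia because of the exponential number of real function calls.
fib_rec(n) = n < 2 ? n : fib_rec(n - 1) + fib_rec(n - 2)

# The iterative style you would actually write in Julia: linear time.
function fib_iter(n)
    a, b = 0, 1
    for _ in 1:n
        a, b = b, a + b
    end
    return a
end

fib_rec(10) == fib_iter(10) == 55
```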
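
To make the world-age bullet above concrete, here is a minimal sketch using Base internals (`Base.get_world_counter` and `Base.invoke_in_world` are not public API and may change):

```julia
f() = "old"
w = Base.get_world_counter()  # snapshot the world age while only the
                              # first definition of `f` exists

f() = "new"                   # redefining `f` bumps the global world counter

f()                           # "new": callers see the latest world
Base.invoke_in_world(w, f)    # "old": runs `f` as it existed at world `w`
```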
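
And a quick sketch of what using Bumper.jl looks like (the package is experimental, so this assumes its current `@no_escape`/`@alloc` macros, which may change):

```julia
using Bumper  # experimental package; assumes it is installed

function sum_of_squares(xs)
    @no_escape begin                          # bump allocations below live
        tmp = @alloc(eltype(xs), length(xs))  # only inside this block and are
        tmp .= xs .^ 2                        # freed at block exit, with no
        sum(tmp)                              # garbage-collector involvement
    end
end

sum_of_squares([3.0, 4.0])  # 25.0, without a GC-tracked scratch array
```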

Core Julia Repos:

  • The REPL is getting better and better. Check out the new visual hints for autocompletion.
  • Mentioned in previous issues of the newsletter, the public keyword for defining API boundaries is now merged.
  • Folks that want to learn more about “effects” (the compiler’s terminology for the properties it can prove about methods being compiled, enabling more aggressive optimization): check out some of the recent work there, e.g. changes making custom compilers like GPUCompiler able to rely on the default CPU compiler’s predictions.
  • For folks that want to customize compilation and code generation, this recent documentation PR would be helpful.
  • We do not have an official way to create “standalone scripts”. Of course, we can just write a script and ask a user to execute it on the command line, but for more sophisticated self-contained public releases of executables it helps to define an “entry point”, like `main` in C or `if __name__ == "__main__"` in Python. This caused a ton of fun and not-fun bikeshedding: a first try, then a Discourse discussion, backtracking to a second try that caused even more consternation, and finally a third try that folks seem to like. The difficulty was balancing expressiveness, universality, obviousness, and backward compatibility.
  • IdDict is useful when you want to distinguish instances of what might be the “same” object, copied multiple times (for cases where == is not enough to distinguish objects). Now it is twice as fast.
  • Folks interested in the previously discussed “dynamic scope variables” can see how they are already being used to make various configurations properly thread-safe (and generally simplify the configuration management). For instance, multiprecision BigFloat configuration.
  • Also discussed in previous newsletter issues, many Julia routines that mutate their input are not safe when used on aliased input (e.g. “map from array A to array B” when A and B are overlapping views into the same parent array). This behavior is shared by any high-performance linear algebra or tensor library, but it frequently trips up novices. These hazards are now being thoroughly documented in every mutating Base function.
  • Interested in how you can write new optimization passes in the Julia compiler? Check this example of enabling “common subexpression elimination” which teaches the compiler how to simplify user code such that it does not repeat the same computation multiple times. LLVM already does this for us, but it is valuable to start doing it at a higher level, in Julia itself, as the Julia IR (intermediary representation) carries a lot more useful semantic information.
  • Julia Base has a very good Array implementation, mostly done in C. Work has started on abstracting away the low-level memory management from the higher-level Array API, so that some of it can be written in simpler Julia code (minimizing the C code to the bare necessities) and so that more “memory manipulating” functionality can be written in Julia (e.g. working with simple buffers without depending on Array). This has been in the works for quite a while and is now starting to take shape.
  • PrecompileTools might become a stdlib.
  • The Julia compiler can be started in a mode where it reports each compilation that gets triggered (and the method being compiled). In 51106 this functionality is improved to also report which line of code triggered the compilation.
  • Yet more improvements to the garbage collector, taking advantage of more parallelism and multithreading.
  • A few more “interactive” stdlibs were removed (here and here) from the sysimg making Julia faster to compile and snappier when launching scripts.
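
The `public` keyword mentioned above lets you mark names as supported API without exporting them. A minimal sketch, assuming a Julia version with the feature (nightly / upcoming 1.11); the module and function names are made up for illustration:

```julia
module MyLib
export greet      # exported: `greet` is available unqualified after `using`
public helper     # public but not exported: supported API, qualified access

greet() = "hello from the exported API"
helper() = "supported API, but you qualify the name"
end

using .MyLib
greet()                        # works unqualified, as always
MyLib.helper()                 # the supported way to reach a public name
Base.ispublic(MyLib, :helper)  # true on versions with the feature
```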
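
The difference between `Dict` and `IdDict` from the bullet above, in a few lines:

```julia
a = [1, 2]
b = [1, 2]   # a == b, but a !== b: equal values, distinct objects

d = Dict(a => "A");    d[b] = "B"   # keys compared with isequal: b replaces a
id = IdDict(a => "A"); id[b] = "B"  # keys compared with ===: both entries kept

length(d)   # 1
length(id)  # 2
```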
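
For the BigFloat example above, the user-facing pattern is the `setprecision` do-block, which confines the precision change to the enclosed dynamic scope (including, with the new scoped-variable machinery, tasks spawned inside it):

```julia
x = setprecision(BigFloat, 512) do
    big(pi)           # computed with a 512-bit significand inside the block
end

precision(x)          # 512
precision(BigFloat)   # back to the default (256) outside the block
```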
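
The aliasing hazard described above is easy to reproduce. Here `src` and `dst` are overlapping views of the same array, so `map!` reads values it has already overwritten:

```julia
x = [1, 2, 3, 4]
src = view(x, 1:3)
dst = view(x, 2:4)          # overlaps src on x[2:3]
map!(v -> 10v, dst, src)    # writes into x while still reading from it

x   # NOT the elementwise [1, 10, 20, 30] you might expect; with the
    # current sequential implementation you get [1, 10, 100, 1000]
```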
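
To illustrate what common subexpression elimination does, here it is at the source level (the compiler performs this on the IR, not on your source text):

```julia
# As written: `a + b` appears, and would naively be computed, twice.
f(a, b) = (a + b) * (a + b)

# After CSE the compiler effectively evaluates the equivalent of:
function f_cse(a, b)
    t = a + b     # compute the shared subexpression once...
    return t * t  # ...and reuse the result
end

f(2, 3) == f_cse(2, 3) == 25
```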

Ecosystem Fora, Maintenance, and Colab Promises (Slack/Zulip/Discourse/Github):

  • The JuliaRegistrator GitHub helper can now neatly add release notes to the GitHub release page of a project. Previously this capability was documented but not fully implemented. Make your release pages neater with custom release notes.
  • Documenter reached version 1.0!!! It has some neat new features for generating changelogs and for BibTeX citations (in a separate package), but maybe the most wonderful new feature is that it is “strict by default”, so your CI can now keep track of outdated documentation.
  • PkgTemplates.jl now supports adding Aqua to the test suite at package creation. More test utilities are on the way (like JuliaFormatter and JET). The goal is to make it easier for beginners to enforce good coding practices by default.
  • BenchmarkTools.jl will soon (October?) tag a 2.0 release following a major bug fix for mutating functions. If you have any guidance or suggestions for the maintainers before that happens, please open an issue on the repo.
  • A ton of new visualization tools for complex numbers and functions of complex numbers.
  • Also new tools for function approximations in the complex plane.
  • Julia has some really cool tools for working with numerical data that carries physical units. In particular, for quite a while we have had Unitful, which encodes the unit information in the type domain. That has some compilation and generalizability issues, so folks recently started working on a runtime alternative: DynamicQuantities.
  • There is a running joke that everyone who does graph theory in Julia creates their own package. Over the last few months there has been a wonderful effort to unite these disparate tools (as discussed in previous issues), and now regular Graphs community calls are being set up to discuss priorities and developments.
  • CUDA.jl v5 is out! Very few breaking changes, and a ton of improvements. Especially nice are the new profiling capabilities.
  • A new version of TerminalPager is out with a ton of improvements in how you can view long textual or tabular data in Julia’s REPL.
  • Every so often people prepare curated lists of useful stable packages and package comparisons. A promising fresh new effort is JuliaPackageComparisons.
  • JuliaActuary released a new package, FinanceModels.jl, to perform modeling and valuation of financial contracts. It is the evolution of Yields.jl and more can be read at their blog post.
  • Benchmark comparison between Symbolics.jl and SymEngine (a C++ tool that came out of the SymPy team ten years ago but is separate): github and discourse.
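
On the units bullet above, a small sketch of Unitful’s type-domain encoding (assumes Unitful.jl is installed). The unit information lives in the type, so dimension errors are caught without a runtime tag on each value:

```julia
using Unitful  # assumes the package is installed

v = 3.0u"m/s"
t = 2.0u"s"
d = v * t      # units combine in the type domain: a length, 6.0 m

d == 6.0u"m"   # true
# d + t        # would throw: cannot add a length and a time
```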

Soapboxes (blogs/talks):


  • An interesting keynote speech at RSECon about creating opensource scientific communities – mentioned on slack.
  • Check out the winner of the Pluto Notebook competition.
  • For quite a while the company JuliaHub has been building a universal documentation portal (in the style of readthedocs), which brings various ecosystem benefits. However, if your documentation CI does anything slightly non-standard, your package’s docs will probably be mangled there. The JuliaHub team is doing a lot of work on polishing this resource, but in the meantime you might want to check how your packages render in the centralized documentation repository. E.g. JuMP has a broken doc build in that environment, and here is a log from QuantumClifford, which also has a broken build. If your package’s docs are broken, you can check the log to see what went wrong and fix it in your package, or opt out of the resource by saying so at DocumentationGeneratorRegistry. Or, of course, you can just not worry about it and wait for the JuliaHub folks to polish the doc generation over the next few years. edit: JuMP now redirects (thanks to DocumentationGeneratorRegistry), so it no longer looks broken.

Please feel free to post below with your own interesting finds, or in-depth explanations, or questions about these developments.

If you would like to help with the draft for next month, please drop your short, well formatted, linked notes in this shared document. Some of it might survive by the time of posting.


Favorite post of the month :smiley:

The new Array work, moving most of its internals to Julia, is a great move.

This is also a stepping stone towards static binaries. Couldn’t be more excited.


Sadly, input into the REPL is horribly slow for me on nightly until I disable the completion hinting. Not sure if this is a known issue? Anyone else getting this? It also sometimes causes spurious exceptions while typing.

I haven’t tried the latest nightly, but I was really impressed by the completion hint speed when the PR originally came out. I’ve been running this version of Julia with that change and a patch on top (which has also now been merged into master), and it’s been a really pleasant experience even on my low-powered laptop: no lag at all.

And forward-compatibility, to my understanding. One of the goals was to make it easy to transition this entrypoint into a simple main function (like in C/Rust/etc.) in a future Julia 2.0 (whenever that happens). So that was another factor to consider in the design.

This tab complete hinting is probably going to flush out bugs in the tab completion that weren’t noticed before, but your error seems to be a very recent regression and should be a simple fix (fixed now). Thanks for the issue.

As for it making your REPL experience slow: can you share more information about your machine? I was curious whether this would happen for anyone, and yours is the first report I’ve heard of. Please open another issue with details. Thanks!


I’m not sure anymore that this issue is connected with the hinting, though.