Originally published on Reddit; I was advised to post it here as well.
TL;DR
Great community. Excellent at expressing “math”. Very fast language. Almost great REPL. Immature ecosystem. Inconvenient debugging. Bad code organization.
Background
Recently I finished a decently sized Julia project (~3100 LoC), and I’d like to share my experience of using the language.
It’s mostly about developer experience, so I hope it provides insights for language developers and users.
Before this project, I had already used Julia plenty of times, mostly to analyze experiment data generated by other programs. That data is generally fairly clean, because I want it to be easily usable in as many languages as possible, but it can be very large. I would generally implement some sort of streaming analysis, then generate various summary tables or plots.
I think Julia is pretty good at this kind of task (with some caveats; see the later section on IO), but this project is different: all data is generated within Julia and analyzed in Julia, requiring more careful planning.
Project Overview
What I am building is a “one-off” racing game simulator. I have pre-made a set of rules, a definition of the racing track, two characters, and their skills. I have a fairly peculiar goal for this game: I will roll up some sort of AI for each character, iterate through a set of random seeds, and see whether Player A ever beats Player B. Then I visualize the results and shelve the project. Therefore, I am writing code that only needs to be generic enough for this one specific match, and I don’t intend it to be usable in any other situation.
There are four significant tasks in the project. First, I must import the racetrack and character data into Julia. Second, I need to implement the rule set of the game. Third, I need to implement an AI for each player. Fourth, I need to visualize a finished game. Each of these tasks posed unique challenges and stressed different Julia features.
The racetrack is defined as a triangular outer bound with Bézier curves on each corner. I did not use any existing Bézier libraries, but implemented the quadratic Bézier formulae directly. To make it usable in a game, I had to implement several data conversions: track position to world coordinates (i.e. arc-length parameterization), altitude, gradient, and curvature. Because Bézier curves don’t have a closed-form arc-length parameterization, I ended up using a numerical solution: I cached the results at 0.1 mm precision and lerp in game (which turns out to be slightly bugged if the input lands exactly on a sampled point… but thankfully that never happened in game). The resulting “acceleration structure” took about 760 MB of memory.
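The caching-plus-lerp approach described above can be sketched roughly like this (a minimal, self-contained illustration; all names are mine, not the project’s, and the real code samples at 0.1 mm precision rather than a fixed sample count):

```julia
# Quadratic Bézier point for control points p0, p1, p2 at parameter t ∈ [0, 1].
bezier(p0, p1, p2, t) = (1 - t)^2 .* p0 .+ 2(1 - t) * t .* p1 .+ t^2 .* p2

# Build a cumulative arc-length table by sampling n + 1 points along the curve.
function arclength_table(p0, p1, p2; n = 1000)
    ts = range(0.0, 1.0; length = n + 1)
    pts = [bezier(p0, p1, p2, t) for t in ts]
    s = zeros(n + 1)
    for i in 2:(n + 1)
        s[i] = s[i - 1] + hypot((pts[i] .- pts[i - 1])...)
    end
    return ts, s
end

# Invert arc length → parameter t by binary search plus lerp between samples.
function t_at_length(ts, s, target)
    i = searchsortedlast(s, target)
    i < 1 && return ts[1]
    i >= length(s) && return ts[end]
    # Guard the case where target sits exactly on a sampled point, the
    # subtle bug the post mentions.
    s[i + 1] == s[i] && return ts[i]
    f = (target - s[i]) / (s[i + 1] - s[i])
    return ts[i] + f * (ts[i + 1] - ts[i])
end
```

For a degenerate straight-line “curve”, `t_at_length` recovers the parameter exactly, which is a convenient sanity check for the table construction.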
The rule set of the game is implemented in a large `Game` structure with several layers of state machines, not unlike any other game. There are two tricky parts: the game must take a snapshot every turn for replay, and there must be an extensible API for the interactions between players, their AI, and their skills, all of which are stateful. Everything in the rule set is implemented in base Julia. Sanity checking is done by giving the same methods to each structure meant to represent the same thing (e.g. track definition, Bézier representation, and acceleration structure) and overlaying them on the same plot. Not the most robust method, but for a one-off program it’s fine.
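The per-turn snapshot idea can be sketched as a deep copy pushed into a history vector (the names and fields here are invented for illustration; the real `Game` structure is far more elaborate):

```julia
mutable struct GameState
    turn::Int
    positions::Vector{Float64}   # one entry per player
end

struct Game
    state::GameState
    history::Vector{GameState}   # one snapshot per completed turn
end

Game(state::GameState) = Game(state, GameState[])

function step!(game::Game)
    # Snapshot *before* mutating, so any turn can be replayed later.
    # deepcopy matters: a plain push! would alias the live state.
    push!(game.history, deepcopy(game.state))
    game.state.turn += 1
    game.state.positions .+= 1.0   # stand-in for the real game rules
    return game
end
```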
The AI is a substantial part of the project. An early attempt was to use `ReinforcementLearning.jl` to train the AI, but it was too complicated for the project scope, so I ended up tailoring hard-coded AIs for each player. These hard-coded AIs command nearly 20 different skills, and themselves have multiple stages, feedback loops, and “mind-reading” (i.e. accessing another AI’s internal state due to… narrative reasons). In the end, each AI is its own finite state machine, using various algorithms, closed-form formulae, heuristics, and PID control to decide what speed and what lane it wants this turn. It’s only two outputs, and the code is already very complicated! Thankfully I decided against allowing the AI to decide which skills to fire… For something so entangled, I want Julia to catch as many mistakes as possible. I used abstract types to implement the AIs and their skills.
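The abstract-type interface pattern mentioned here might look roughly like this (a toy sketch with invented names, not the project’s actual AI code):

```julia
abstract type AbstractAI end

# The informal interface contract: every AI returns a (speed, lane) tuple.
# The fallback produces a clear runtime error naming the offending type.
decide(ai::AbstractAI, state) = error("decide is not defined for $(typeof(ai))")

mutable struct ChaserAI <: AbstractAI
    target_lane::Int
end

# A trivial concrete AI: full speed, always in its target lane.
decide(ai::ChaserAI, state) = (1.0, ai.target_lane)
```

The game loop only ever calls `decide(ai, state)`, so each AI’s internal state machine stays encapsulated in its own struct.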
The visualization has multiple purposes. It plots the track to show whether there are any data import errors. It shows how each state variable changes during a race. Finally, it generates an animation as if we were watching the game in real time. This tool is indispensable: not only does it help debugging, it also keeps me motivated, as each reached milestone shows up in the data. I used `CairoMakie` extensively for visualization.
These are not all the details of the project, but the rest aren’t important. Let’s start talking about Julia!
TTFX
Time-to-first-X, an ancient problem in Julia caused by its JIT compilation. With Julia 1.9 and 1.10 (I started with 1.9; when 1.10 was released, I upgraded), it still takes a few seconds to recompile the package or to plot the first figure, but long gone are the days when I needed to wait minutes for the REPL to start. I’d say TTFX is not a big issue now.
Community
The community is awesome. I received so much help from the Discord channel and Discourse that without them, I would have never… actually, I would still have finished the project, but with a lot more hurdles. Counterintuitively, questions on Discourse get responses much quicker than on Discord, so I recommend using that. The forum format also allows questions to persist, so others can find the answers later.
Code Organization
Ok…
Code organization needs to be explained in much, much more detail in the official documentation. What’s there is nowhere near enough; it took me a long time just to figure out how to have both a package and a top-level script. I also disagree with some advice in the official documents:
- I advise against using `include`. Use single-file modules and `Reexport.jl` instead.
- I strongly advise against using `nocase` naming. Please use `snake_case` instead.
Read these methods generated by `Makie`’s `@recipe`:

```julia
trackbounds!(ax, track)
trackcornerhandles!(ax, track)
trackdefinitionvisual!(ax, track)
```
You get the idea.
For those interested, I use VSCode. I have a package set up, whose source files are under `src/`, and top-level scripts under `scripts/`. During development, I activate the package’s environment and evaluate code cells in these top-level scripts. This has served me well so far, except for one issue: I cannot separate development-time dependencies from required dependencies. Therefore, the package’s `Project.toml` gets littered with unnecessary deps like benchmark tools.
That’s only the first problem.
Code Structure
The official recommendation for organizing modules is to have a module file include multiple “sub-files”:

```julia
# ModuleA.jl
module ModuleA
include("./a.jl")
include("./b.jl")
end
```
In my opinion, this is just not a good idea. I use it in this project because it’s the most convenient option, but if this of all things is the most convenient organization method in Julia, the language could use a better module system.
EDIT: The following paragraph is factually incorrect. As pointed out in the comments, `include` checks that a file is syntactically correct. I mixed this up with my experience of a function with an unmatched `end` causing issues in the same file.
The main problem is that Julia code can break due to the order of definitions. With this scheme of direct includes, the order of inclusion is critical. In fact, plenty of errors I encountered were due to seemingly unrelated code in another file. Especially if somewhere there’s an unmatched `end`, I could have to go through every single file, line by line, manually, to fix an error that’s reported nowhere close to its origin. Due to how `include` works, the scope of an unclosed `end` could leak beyond file boundaries and cause problems somewhere completely unexpected.
Another problem is that when reading another person’s code, especially on GitHub or other places where an LSP is not available, it’s very difficult to find where a symbol is defined: if I am reading `b.jl`, anything there could be defined in `ModuleA.jl` or `a.jl`.
I’ll just straight up say that it’s worse than `#include` in C. In C, there’s at least forward declaration, which allows me to break cycles without shuffling code between files. Unfortunately, Julia doesn’t have that.
An alternative is to use one module per file and import with `using`. I think this is better: it keeps related definitions close to each other, and while `using` doesn’t show which symbols are imported, at least I know that something is being imported from a specific module. An unmatched `end` also tends to get caught at the module boundary. However, when using small modules, `Reexport.jl` is pretty much mandatory; otherwise it’d be extremely tedious to specify everything that needs to be exported at every level of imports.
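A minimal sketch of the one-module-per-file layout (invented names; in a real package each module would live in its own file pulled in by a single `include`, and `Reexport.jl`’s `@reexport using .Geometry` would replace the manual re-export shown here):

```julia
# Two tiny modules in one file for demonstration; in a real package each
# would be its own file, brought in with a single `include`.
module Geometry
export curvature
curvature(t) = 1.0 / (1.0 + t^2)   # placeholder implementation
end

module TrackTools
import ..Geometry
# Manual re-export of Geometry's API; `@reexport using .Geometry` from
# Reexport.jl automates exactly this step.
export curvature
const curvature = Geometry.curvature
end

using .TrackTools   # `curvature` is now directly visible to users
```

The payoff is that consumers only ever say `using TrackTools`, while each definition still lives next to its related code.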
Another alternative is to break up the code into separate packages. It might be a me problem, but I find this very tedious to set up. This is especially true because this is a one-off program: I don’t know the best structure of the code beforehand. This is also similar to research code: I won’t even know what code to write until I run some experiments. Packages are quite inflexible due to how manifests work. I can’t safely rename packages without breaking not just the current project but the entire local cached registry, due to duplicate UUIDs and such. Packages might work for a large library, but for something in early development, I don’t think they are a good idea.
If anyone thinks that Julia’s code organization tools are adequate, I kindly ask you to write Rust for a week. Then ask yourself: do you still think Julia’s module system is good? I can read and navigate Rust code straight from docs.rs, with its auto-generated static code listings, while in Julia I can’t find a symbol even with the help of an LSP.
If you want a fairer comparison, look no further than Ruby. It’s dynamic, it has a lot of metaprogramming, it has Pascal-like syntax. Granted, people do crazy things with Ruby that lead to nested-macro and dynamic-class insanity, but it has an actually good module system that doesn’t require (pun intended) including files to function.
Naming
Naming is hard [citation needed]. Let me reiterate: use `snake_case` for functions and `PascalCase` for types. Please don’t use `nocase`, even though the official docs recommend it.
One of Julia’s most powerful tools is multi-methods, which natively support auto-vectorization. I use them extensively, for defining both formal (i.e. abstract-type) and informal (i.e. a collection of methods) interfaces. I enjoy the ability to just vectorize a function I wrote, such as:

```julia
curves = curvature.(Ref(t), xs)
```
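As a concrete illustration of this broadcast pattern (with a toy `Track` type and formula standing in for the project’s real ones): wrapping the shared argument in `Ref` makes broadcasting treat it as a scalar, so the dot call iterates only over `xs`.

```julia
struct Track
    radius::Float64   # toy field standing in for the real track data
end

# Toy curvature formula; only the broadcasting pattern matters here.
curvature(t::Track, x) = sin(x) / t.radius

t = Track(2.0)
xs = [0.0, 0.5, 1.0]
# Ref(t) shields `t` from broadcasting, so this makes three scalar calls,
# one per element of `xs`.
curves = curvature.(Ref(t), xs)
```

Without the `Ref`, broadcasting would try to iterate over `t` as well and fail (or silently pair elements, for iterable arguments).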
EDIT: The following paragraph is not correct. As pointed out in the comments, a global definition will not be overwritten by a local assignment. Shadowing is still a concern, though.
There are a few caveats related to naming. Namely, it is quite easy to accidentally not just shadow, but also change a global definition if I don’t name things right. Take the following example:
```julia
# a.jl
function curvature(...)
end

# b.jl
function some_other_function()
    ...
    curvature = ...
end
```
Well, apparently after this executes, `curvature`’s definition is overwritten, and all other code breaks. The LSP doesn’t catch this very often (see the later section on the LSP), but if you see that a variable’s color looks strange, check immediately. For this reason, I’ve started to think that `get_something()` is a better method name than `something()`. Or maybe `something_of()` and `verb_noun()`.
Relatedly, it is pretty easy to make a mistake when defining multi-methods:
```julia
module A
export AbstractA, method_a
abstract type AbstractA end
method_a(a::AbstractA) = error("method_a() is not defined for $(typeof(a))")
end

using A

struct ConcreteA <: AbstractA end
method_a(a::ConcreteA) = ...
```
This `method_a` is not the same as the one in the module, because the correct way is to define `A.method_a`. The LSP will not catch this error. I don’t really have a solution for this problem, because the ability to define such multi-methods is a major feature of Julia. The best suggestion I have is to write clear runtime error messages like the one above, so when I hit an error, I know immediately which method and which type are the culprits.
Finding symbols in Julia is… hard. The `?` command shows every method with the same name, and I haven’t found a way around it. VSCode’s LSP also doesn’t reliably find the correct definition. For a dynamic language, this is probably inevitable. I wouldn’t say it’s more difficult than, say, Python, unless you deal with a mess of `include`s.
Enums
I used the `@enum` macro a few times in my code.
Don’t use it, really. Enums are not namespaced, there’s no multiple dispatch, and there’s no pattern matching. Abstract types are more powerful.
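A small comparison sketch of the two approaches (toy names): `@enum` members land directly in the enclosing scope and branch via `if`/`==` chains, while singleton types under an abstract type get namespacing through their module and per-case behavior through dispatch.

```julia
# @enum: the members Start/Corner are injected into the enclosing scope
# (no namespacing), and branching on them needs comparisons.
@enum Phase begin
    Start
    Corner
end

phase_speed(p::Phase) = p == Start ? 10.0 : 4.0

# Abstract type + singleton structs: each phase is its own type, so
# per-phase behavior is just another method (multiple dispatch).
abstract type AbstractPhase end
struct StartPhase <: AbstractPhase end
struct CornerPhase <: AbstractPhase end

max_speed(::StartPhase) = 10.0
max_speed(::CornerPhase) = 4.0
```

Adding a new phase to the type-based version is one struct and one method; the enum version means editing every branch that compares `Phase` values.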
Ecosystem
It’s not very mature. The package manager is good, especially when it comes to native libraries (`*_jll`). However, in pretty much every area this project touched, I had to use some half-dead, semi-documented packages. They are so prevalent that there are only a few packages I would say are not half-dead and are adequately documented:

- `Base`
- `Makie`
- `DataStructures`
- `Random`
- `JSON3`
- `DataFrames`

And… that’s about it. Reading source code is absolutely required, and good luck with all the `include`s messing with scopes.
Julia is surprisingly lacking in the more basic mathematics department. The first open source numerical software I used was Octave, and it had many packages implementing features of Matlab. Then there are SciPy and SymPy, which almost reach feature parity with both of them. Julia’s equivalents, meanwhile, consist of a bunch of zombie packages, with nowhere near those feature sets. I am talking about basic stuff like statistics, distributions, solvers, graphs, symbolics, signal processing, etc.
When I was implementing the first version with reinforcement learning, I had to dig through 10 different packages, locating symbols across `using`, `include`, native code, and PyCall. Later, I encountered a bug in Makie, whose code is no easier to navigate due to the proliferation of macros and `kwargs`. For lower-level packages, I can probably work around problems by implementing parts myself, but Julia packages tend to be overabstracted. If there’s a problem in a package, that’s it: it has to be fixed in that package. There’s no way to circumvent it; I can only hope that either I learn the package well enough to fix it, or that a fix will be provided soon. These are the experiences that made me want to never use Julia again (don’t worry, I’ll still use it).
I might sound harsh here, and I shouldn’t be. Julia is mainly a community project, with many contributors donating their free time to maintain the ecosystem that allows me to just `]add Package`. However, I really do not feel safe using many libraries, especially when basic functionality isn’t endorsed by some core team that guarantees its stability.
This is especially true for IO.
IO
Being a math language, Julia needs to work with data a lot. Unfortunately, I think the I/O landscape is a mess. There is no official implementation of CSV or JSON, which are pretty much the lingua franca of data exchange. Well, there’s `DelimitedFiles` in the standard library, but it doesn’t work with anything slightly more complicated. There’s TOML, but it’s limited to simple parsing and printing, and TOML is not a good data exchange format anyway (it’s good for configs). `Tar`, while it exists, is nowhere close to Python’s equivalent.
The two libraries I ended up using were `CSV.jl` and `JSON3.jl`. They are pretty much universally recommended, so for these particular formats, Julia’s IO is good. I still think something like them should be in the stdlib. Even then, there are some minor issues, such as JSON3 + `StructTypes` clashing with `ProtoStruct`.
For other formats, Julia significantly lags behind languages like R or Python. Recently popular are Parquet and Arrow IPC, efficient binary formats for exchanging large amounts of data. However, Julia’s support for either is terrible. Despite being under the `JuliaIO` organization, these important (I think) libraries remain unmaintained and unfinished. This really tanks my confidence in the organization, honestly.
I knew better than to use Arrow or Parquet in this project, because I know how bad their support is in Julia, but the primary reason I removed Julia from all my research code was that it cannot work with these files properly.
LSP
The LSP is slow to update definitions and can’t reliably find them. This is to be expected from a dynamic language; sometimes I just have to wait or poke around. The LSP won’t show any information if there’s any type ambiguity, yet there are places where type annotation is impossible (like loop variables), so the LSP leaves some black holes that require manual tracing. Symbol finding in VSCode is also limited: it only shows which file each symbol is from, not its type signature.
Again, this cannot always be avoided in a dynamic language. Annotate as much as possible to alleviate it. Otherwise, the VSCode plugin is decent.
REPL
The REPL is almost great, and Lisp-like. Evaluating code cells, redefining functions, etc. are part of my workflow. It is always nice to see the immediate effect of a change while keeping some application state around. For research code, the REPL is great…
…except for struct redefinitions. `ProtoStruct` and the like cannot always be used, as they cause problems with custom constructors, `@kwdef`, or `StructTypes` (which is needed for `JSON3`). This means that early in development, restarting the REPL is a frequent requirement.
Another problem is that there is no easy way to drop into a debugger from an existing REPL session. I can’t evaluate one code cell and debug another. I can’t type an expression into the REPL and enter a debugger; I can only run an entire file. This is very annoying when I have a mysterious stack trace that involves a lot of state and corner cases, because accessing a debugger is inconvenient. I either rerun the whole script, which throws away the REPL state altogether, or litter the functions with print statements. Thankfully both approaches worked for this project, but I doubt they scale.
I can dream of something like “Debug Code Cell in REPL”. Or even better: something like Common Lisp, where a sub-REPL opens on error and class redefinition prompts you to update existing instances. These would probably be a lot of work, though, so I don’t expect much in these directions.
Revise
Revise is what makes this project possible. If I had to rerun the whole script after every single change, I would die of old age before finishing. However, there are some minor issues:
- Struct and constant redefinition, which are Julia’s own limitations
- Changing the export list is not reflected by Revise and requires a restart
Given Julia’s limitations, Revise is doing very well. Especially helpful is that it removes stale definitions, which greatly reduces the probability of making a mistake. In fact, I think Revise is better than something like a Jupyter notebook for this reason alone. Fewer mistakes, but imperfect.
Makie
Makie is very powerful and very fast (once warmed up), but not mature. I now know it well enough to navigate its documentation, but when I started out I had no idea how to read it. Especially problematic are all those `kwargs` used in its API, which require a lot of digging to figure out. I think its documentation needs some reorganization. It did, however, let me build very complex plots.
I hit a bug related to `RichText`, which is mostly undocumented and had no workaround. I also encountered a mysterious stack overflow in one specific REPL session, but it went away after a restart, so I never understood what happened, as there was no backtrace. Even for more normal errors, Makie tends to generate backtraces at strange places that don’t help with debugging, and using its `Observable` interface requires quite a bit of care.
In general, if Makie has a problem, it is impossible to work around, because its API is wrapped in so many layers of abstraction.
I don’t think Makie is an easy API to learn, because there are just so many interconnected components. Its documentation needs to be as good as Matplotlib’s to really work for newcomers, because the LSP and REPL will not be of any help in finding what those `; kwargs...` are.
Conclusion
Julia is fast and has many features that I like: a powerful REPL, automatic vectorization, concise function definitions, multi-methods, etc. However, I just feel that Julia is still immature, despite having 11 stable minor versions. What truly worries me is the following:
- Official documentation recommending the use of footguns
- The official organization (is JuliaIO official?) not exactly maintaining important packages
  - They certainly are maintaining some other packages, like compression algorithms. It just happens that what I needed to use was not maintained.
- The proliferation of underdocumented, unmaintained, half-finished zombie packages
  - Such packages take up canonical names, forcing the real maintained packages to use `Name2.jl`, etc.
- A tendency to encounter overabstracted packages
Will I continue to use Julia, then?
Before this project, I’d have said yes. Otherwise, I wouldn’t have attempted it.
Right now? Uh, I will use it if I have to, but no more Julia projects from me. Here’s the thing:
- Python has vastly better IO support, both in the stdlib and from third parties.
- Python’s plotting libraries, and many other libraries I tend to use, have better documentation.
  - Although Python’s package management and module loading are horrible.
- Rust is as fast as the fastest type-stable JIT-compiled Julia, but has a much better LSP, an out-of-this-world module system, much better automatically generated documentation, safer types, and just as good a package manager as Julia.
  - Of course, it has no REPL or interactive plotting.
Then when is Julia actually good?
- When user-defined functions are used extensively, for mathematics, and performance is critical
- When Jupyter makes sense but turns out to be too messy; code cells + Revise beat Jupyter any day
- I suppose when macro magic is required, like autodiffing user-defined functions. I have no idea how Flux does it.
Most of what I do is just none of these things.