What materials should I read for extending the compiler?

I have some knowledge of metaprogramming from my last package. Now, I need to descend into madness, to figure out how the compiler works and how I can access and modify the working of each stage in the pipeline.

Implementing a performant attribute system in a moddable game requires optimizations not found in typical Julia compilation. For example, if there is a union of 5 types, the Julia compiler might fall back to Any and let the runtime do the work. However, that’s not suitable for some use cases. I might need it to use 3 bits for the type, 64 bits for the data, and do some bit tricks to optimize away dynamic dispatches and such. I would need to play with the compiler. What resources should I read?

What you describe is somewhat outside the scope of typical Julia performance optimization. But using 3 bits for the type and 64 bits for the data is far from the proper solution for such problems, as performance at the level you indicate depends much more on good use of caches, SIMD, and similar optimizations. You could for example look at the packages and pointers here: Optimizing your code (modernjuliaworkflows.github.io)

I would suggest revisiting the @code_* macros and then trying to identify precisely what kind of optimization you need. Looking at the different outputs will also teach you how the compiler works.
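As a rough sketch of that workflow (the `Projectile` type and `total_damage` function are made up for illustration, not taken from the thread), you could compare what the different @code_* macros print for a function with a deliberately abstract field type:

```julia
using InteractiveUtils  # provides the @code_* macros outside the REPL

# Hypothetical example type: the abstract field type `Real` means the
# compiler cannot know the concrete type of `damage` ahead of time.
struct Projectile
    damage::Real
end

total_damage(ps) = sum(p.damage for p in ps)

ps = [Projectile(1), Projectile(2.5)]

# Inferred types; non-concrete ones (e.g. `Real`, `Any`) are highlighted.
@code_warntype total_damage(ps)

# Julia IR after type inference and optimization.
@code_typed total_damage(ps)

# LLVM IR generated for this method instance.
@code_llvm debuginfo=:none total_damage(ps)
```

Changing the field to `damage::Float64` and rerunning the same macros is a quick way to see how much of the output is spent on dynamic dispatch and boxing.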

If you are interested, you could try to learn from HPC or game programming courses. They might be focused on C/C++, but they can help you to understand and write good Julia code as well. I doubt that at this point a modification of the compiler itself is necessary.


I mean… Java performs virtual method table lookup, but that’s still far from an ideal way. I’m trying to optimize away virtual method lookup, dictionary lookup, etc… what do you mean by “not proper solution”?

So, off the bat, I rather doubt that messing with the compiler internals is a good fit for what you’re describing here, e.g. your example of a union of 5 types falling back to Any is much more easily avoided by using union splitting, or using something like SumTypes.jl to encapsulate the union and automate the splitting. Julia is a very powerful and flexible language, and most goals can be accomplished without compiler modifications. This is a bit like trying to learn to do a backflip while still learning to walk.
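To make the union splitting idea concrete, here is a minimal hand-written sketch. The projectile types and the `hit` function are hypothetical; SumTypes.jl automates this kind of branching, and its own API is not shown here.

```julia
# Hypothetical projectile types, each with its own `hit` method.
struct Arrow
    damage::Float64
end
struct Bullet
    damage::Float64
    pierce::Float64
end
struct Rocket
    damage::Float64
    radius::Float64
end

hit(p::Arrow)  = p.damage
hit(p::Bullet) = p.damage + p.pierce
hit(p::Rocket) = p.damage * p.radius

const AnyProjectile = Union{Arrow, Bullet, Rocket}

# Manual union splitting: inside each branch `p` has a known concrete
# type, so the `hit` call there can be resolved at compile time.
function hit_split(p::AnyProjectile)
    if p isa Arrow
        return hit(p)
    elseif p isa Bullet
        return hit(p)
    else
        return hit(p::Rocket)
    end
end

total(ps) = sum(hit_split, ps)

ps = AnyProjectile[Arrow(1.0), Bullet(2.0, 0.5), Rocket(3.0, 2.0)]
total(ps)
```

For small unions the compiler often performs this splitting on its own; writing the branches by hand just makes the behaviour explicit and independent of the compiler's heuristics.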

With that said, let me link to a few resources on the topic in case this ends up being useful to people whose use-cases do involve working with compiler modifications.

The most user-friendly[1] thing here is to use @generated functions, which are a powerful way to specialize code generation based on the input types of a function. Generated functions also let you return a CodeInfo object instead of an Expr, and this has far-reaching implications. The section on Cassette of this blogpost: The Emergent Features of JuliaLang: Part I · Invenia Blog has a great explanation.
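For readers who have not seen one, here is a small sketch of a generated function that returns an Expr (the simpler of the two options; returning a CodeInfo is not shown). The `sum_fields` function and the `Stats` struct are invented for this example.

```julia
# Inside the body of a generated function, `x` is bound to the *type* of
# the argument. The returned expression becomes the method body that is
# compiled for that type.
@generated function sum_fields(x)
    names = fieldnames(x)
    isempty(names) && return :(0)
    # Build `getfield(x, :a) + getfield(x, :b) + ...` with one term per field.
    terms = [:(getfield(x, $(QuoteNode(n)))) for n in names]
    return Expr(:call, :+, terms...)
end

# Hypothetical struct, e.g. attributes contributed by different mods.
struct Stats
    armor::Int
    shield::Int
    morale::Int
end

sum_fields(Stats(1, 2, 3))
```

The loop over field names runs once per argument type at compile time, so the compiled method is just a fixed chain of additions.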

That leads naturally to the tools built on generated functions, namely Cassette.jl and IRTools.jl. Both of those have pretty good documentation pages I’d recommend reading. These tools are now kinda old-fashioned and frozen in time. They give some cool capabilities but have some fundamental limitations people are somewhat unhappy with.

Next we have the new generation of compiler plugins via the abstract interpreter mechanisms. This is the bleeding edge of compiler plugins, and there’s exciting stuff happening here, but it’s also constantly changing, and very unstable. Here be dragons. There are two big, stable-ish, and modern packages that take advantage of this mechanism for different purposes: Enzyme.jl and JET.jl. Most other newer or more experimental packages that use the abstract interpreter are developed by copy-pasting chunks of code from Enzyme.jl’s compiler passes and tweaking it until it works for their purposes. For example, StaticCompiler.jl and AllocCheck.jl were developed via liberal use of reverse engineered Enzyme code.

For these compiler plugins, there’s not really any pedagogical material, just existing code bases you can try to understand. One nice exception though would be a series of Discourse posts by @Tomas_Pevny. I’d strongly recommend checking out this thread: Materials about AbstractInterpreter and this thread: Manual type inference of hand-written IRCode

This information has a very short half-life though and will probably all be out of date in a matter of months, so there’s not going to be any useful pedagogical materials to learn from until it stabilizes.

  1. note: I’m using “user friendly” in relative terms. Generated functions should probably be avoided if there’s a simpler way of accomplishing the same goal.


In the meantime… @Mason gave you a nice reply :slight_smile: I’ll just copy my text here in case it is interesting :wink:

Data layouts should be multiples of 8 bits, hence, if you add a 3-bit field, it will either eat up an additional 5 bits in order to fit, or you need to create a specialized type, such as BitVector. (As @Mason added, something like SumTypes.jl could do the job.)
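A small sketch of what that means in practice. The `Tagged` struct and the `pack`, `tag_of`, and `payload_of` helpers are hypothetical names for illustration. Note that packing by hand into one 64-bit word leaves only 61 bits for the payload, not the full 64 from the original question.

```julia
# A 3-bit tag still needs a whole byte as a struct field, and alignment
# of the UInt64 field adds padding on top of that.
struct Tagged
    tag::UInt8
    data::UInt64
end

# Typically 16 on 64-bit platforms: 1 byte tag, 7 bytes padding, 8 bytes data.
sizeof(Tagged)

# Packing by hand into a single word: low 3 bits are the tag,
# the remaining 61 bits are the payload.
const TAG_BITS = 3
const TAG_MASK = (UInt64(1) << TAG_BITS) - UInt64(1)

pack(tag::Integer, payload::Integer) =
    (UInt64(payload) << TAG_BITS) | (UInt64(tag) & TAG_MASK)

tag_of(w::UInt64)     = w & TAG_MASK
payload_of(w::UInt64) = w >> TAG_BITS

w = pack(5, 12345)
(tag_of(w), payload_of(w))
```

Payloads wider than 61 bits would silently lose their top bits in `pack`, so a real implementation would need a range check.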

Anyway, I didn’t want to discourage you. From the little information I know, my advice would be to focus on foundations such as those found in good books on game programming in the language XY (doesn’t really matter, but probably C++ as that dominates game programming). In such a book, you will learn how CPUs and compilers work in general and how to overcome issues such as dealing with heterogeneous data. Reading such a book will also improve your Julia skills, and I think there is no Julia-specific book with such a scope. Of course, you can also learn by doing and try many examples and incrementally improve.


Oh, I forgot… they can come from different mods. One mod might add 2 types of projectile and another mod adds 2 more types and suddenly there are five types of projectile and your core game code wouldn’t know in advance. I may’ve made the problem seem a bit easier than it actually is. Here’s what I meant.

Imagine a game. Now, one mod might add an armor system, and another mod might add armor piercing on top of the previous mod. One mod might add unit shields and another might add unit morale. Suddenly, the unit struct you had needs more space. And you want to make it as performant as if the armor system, morale system, etc… were hard-coded into the game.

I apologize for making the problem look too easy.

This doesn’t change anything I said. I strongly recommend learning more about the regular tools available in the language for your purposes. Customizing the compilation pipeline is very unlikely to be what you need.


Wait… I don’t actually need to mess with the compiler to do these things? Yay!

The last time I checked, however, Stellaris, a great game BTW, suffered from a massive lag problem because the engine allows mods and perhaps has its own mod compiler or interpreter in it, adding overhead. Some games use inheritance to deal with multiple object types, but virtual function calls go through a virtual method table, which costs performance.

The canonical approach (which somewhat unironically is most widely championed in gamedev circles) is to use handles instead of pointers/objects.

In other words, your entities have an integer id, and then each extensible attribute is represented by an array like extraFancyName::Vector{Symbol}, and you access that not by entity.extraFancyName but instead via extraFancyName[entity.id].

If you have multiple types of entities, and it is conceivable that these types get mixed, then don’t express that in the julia type system (dynamic dispatch BAD in inner loops); instead use some bits of your id for the entity-type.

This requires that you maintain global-ish control over entity lifetimes, and almost surely requires you to write dedicated code for batch updates. On the upside, this gives you de-facto arena allocation for all your relevant data, so you ideally produce minimal GC pressure.
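A minimal sketch of this handle-based layout, under my own assumptions: the names `Handle`, `spawn!`, `apply_damage!` and the choice of 3 kind bits in a 32-bit id are illustrative, not something prescribed above.

```julia
# The low 3 bits of the id store the entity kind, the rest the array index.
const KIND_BITS = 3
const KIND_MASK = (UInt32(1) << KIND_BITS) - UInt32(1)

struct Handle
    id::UInt32
end
Handle(kind::Integer, index::Integer) =
    Handle((UInt32(index) << KIND_BITS) | (UInt32(kind) & KIND_MASK))

kind_of(h::Handle)  = h.id & KIND_MASK
index_of(h::Handle) = Int(h.id >> KIND_BITS)

# One array per attribute, indexed by entity index. A mod that adds an
# attribute would add another array of the same length.
const health = Float64[]
const armor  = Float64[]

function spawn!(kind::Integer; hp = 100.0, ar = 0.0)
    push!(health, hp)
    push!(armor, ar)
    return Handle(kind, length(health))
end

# Batch update over plain concrete arrays: no dynamic dispatch in the loop.
function apply_damage!(health, armor, dmg)
    for i in eachindex(health, armor)
        health[i] -= max(dmg - armor[i], 0.0)
    end
    return health
end

a = spawn!(1; hp = 50.0, ar = 5.0)
b = spawn!(2)
apply_damage!(health, armor, 10.0)
(health[index_of(a)], health[index_of(b)])
```

This sketch never frees entities. Reusing slots safely is the "global-ish control over entity lifetimes" part and usually needs a free list or generation counters.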

This kind of approach works not just in C or C++ but also in julia and even in java (really! That’s my day job! This is not entirely idiomatic java code, but what can we do…).


Oh, that’s one nice way to do stuffs for sure. I think I explored multiple possibilities before.

Still, mods can usually run arbitrary logic, and making that fast… well… issues.

And I’m quite self-taught on how to make games so I’m quite a bit clumsy.

I don’t think this is an issue that needs a lot of thinking. Just don’t follow fads that piss away cycles like there is no tomorrow. Look at what amazing things could be done with mods on Half-Life, i.e. a heavily modified Quake engine, on late 90s technology (cf. e.g. Counter-Strike, Natural Selection, Day of Defeat).

I have no clue on gamedev. But I somehow question your fixation on julia for that?

Like, I would understand if you e.g. discover your inner antifascist and want to improve the state of a training simulator / games for FPV pilots to help in the war effort, and want to use the recently linked AerialVehicles.jl as a major component. Sure, awesome! But that is no reason for the majority of your stuff to be written in julia.

The language is just a tool. Mix-and-match components you like from wherever you get them.

Ok, this could be very useful in Agents.jl for multi-agent models :open_mouth:

I started out in pygame.
I have an option of going into c++ (pain)
Or I could use Julia.
So I decided, maybe Julia it is?
And yeah… I’m like a blind monster crawling toward something.


To give you a pointer that might be helpful: learn about entity component systems (ECS). This seems to be the common way to design modular and extensible (like with mods and stuff) systems in gamedev. There are implementations in Julia, e.g. Overseer.jl. I’d recommend checking out the README of Overseer.jl for an example.
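To give a feel for the idea, here is a tiny hand-rolled sketch. This is not Overseer.jl's API; `Store`, `World`, `register!`, `store`, and `attach!` are hypothetical names. Each mod registers its own component type at load time, and systems loop over concretely typed arrays.

```julia
# Dense storage for one component type.
struct Store{T}
    data::Vector{T}
    owner::Vector{Int}       # entity id that owns data[i]
    slot::Dict{Int, Int}     # entity id -> index into data
end
Store{T}() where {T} = Store{T}(T[], Int[], Dict{Int, Int}())

struct World
    stores::Dict{DataType, Any}
end
World() = World(Dict{DataType, Any}())

# Called by a mod at load time to add a new component type.
register!(w::World, ::Type{T}) where {T} = (w.stores[T] = Store{T}(); w)

# The type assertion hands the caller a concretely typed store, so the
# dictionary lookup is paid once per system call, not once per entity.
store(w::World, ::Type{T}) where {T} = w.stores[T]::Store{T}

function attach!(w::World, entity::Int, c::T) where {T}
    s = store(w, T)
    push!(s.data, c)
    push!(s.owner, entity)
    s.slot[entity] = length(s.data)
    return w
end

# Components from two hypothetical mods.
struct Armor
    value::Float64
end
struct Morale
    value::Float64
end

# A system: fetch the store once, then loop over a Vector{Armor}.
function total_armor(w::World)
    s = store(w, Armor)
    return sum(c -> c.value, s.data; init = 0.0)
end

w = World()
register!(w, Armor)
register!(w, Morale)
attach!(w, 1, Armor(5.0))
attach!(w, 2, Armor(2.5))
attach!(w, 1, Morale(0.8))
total_armor(w)
```

The core game code never needs to know about `Armor` or `Morale` in advance, which is the situation with mods described earlier in the thread.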

I actually made an ECS. It’s supposedly quite performant but… hard to use.