Anyone interested in coding a Julia ECS?

If you know me, you know that I often make lots of attempts at trying something, and fail at many of them, many times before I even got to coding because I realized it would be too big or I wouldn’t win or whatever.

Here goes another attempt!

If you believe or wish that Julia will be the best tool for making games, you are not alone. If so, let’s carve out the path to glory together.

Let’s start simple, with a humble thing, Entity Component System.
This ECS would derive its performance from the fused update function and, once it’s done, the LoopModel library.

Unfortunately, I still have some problems with my ideas, for example, how to delete or change the archetype of entities efficiently and so on. If you’re up to the challenge, I invite you.

If a team can be made such that the project has a chance of being viable, I’ll start coding. If it’s not, then I’ll bury this project for maybe a few years and perhaps dig it up when the circumstances have changed. You do not have to start coding right away. Once I get confirmed team member(s), I will start programming what I can as the frame and then you can add more features until we get a rich library. An ECS can be small (a few hundred LOCs) or really large (like flecs). The size of this project will depend on a number of factors like features supported, our capabilities, etc.

I may not be a very smart programmer here, but Julia gives me the best chance of making a game I want come true, so, I will push even with my puny strength.

P.S. I’m sorry for everyone who tries to support me. I may not be able to complete most things I set out to do, or had to cancel them for various reasons, but I have to keep trying.

2 Likes

Did you check GitHub - heyx3/Bplus.jl: A modern OpenGL 4.6 rendering framework, written in Julia.? Seems like they have implemented an ECS here: GitHub - heyx3/BplusTools.jl: Extra add-on systems for the Julia game framework B+, I’m not at all an expert in this field but maybe it could be a good starting point?

1 Like

Actually there are more:

what about contributing to one of the systems? My first impression is that Overseer.jl seems to be in good shape. Or do you see shortcomings in those implementations?

1 Like

They’re good. However, Overseer is a sparse ECS, not archetypal ECS, which is what enables my optimization. The B+ ECS is explicitly (by the developer) not performance-focused. Mine, on the other hand, is so performance-conscious that I’ve yet to figure out many key features because I’ve yet to find a performant way to implement them, which is actually why I’m asking for help, to find ways to efficiently implement various features.

Anyway… let’s continue the discussion here, where it’s more relevant.

The issue is that if you just try to make it work without trying to make it fast at all, you could store every entity as a dictionary and store every update as a function that checks each element whether or not they have the right components. The reason you need a complex ECS is because you need it to be fast, otherwise, you could’ve stuck to OOP or sloppily implemented ECS.

This doesn’t sound right in my ears because fast doesn’t need to be complex necessarily. Perhaps just phrasing though.

I also agree that the order is not quite as static as presented. Of course you should strive to make a reasonable guess what architecture of your code would be able to solve your problem. A bit like a painter has a motive in mind when they start painting. But still you don’t need to the flowers in the top-level window of the fifth house in the background if you didn’t put down some sky color yet.
So what I want to say is this: These types of “mad stuff”-hyperoptimizations sound very much like they are way too much for reasonable performance. Try something simpler but reasonable first. Maybe just good ol’ runtime computation, maybe DynamicSumTypes.jl, maybe make a post and try to be as specific as possible (perhaps with some demo code even!) what the problem is you want to solve to get some more ideas. But then go and actually focus on getting something that you can run to evaluate! Perhaps you will find that your reasonable approach is godd enough already. More likely there some bugs in the implementation that need fixing first and afterwards you realize that performance is good enough. If it still isn’t and you have the numbers, then you can come back here with (a simplified version of) the code and the numbers and will find many folks willing to optimize the heck out of your code. You are not alone if you don’t want to be :slight_smile:

Okay, I’ll give you some explanation as to how I ended up here.

Imagine you want a game with as many entities as possible. (Players/modders want as many entities as possible, so there is no “enough” here, only as good as possible.)

Now, each entity can have types, status effects, etc. If you store all entities with different types in the same array, you get type instability. Maybe you could do some virtual function tricks, and that’s OOP. However, virtual function calls are expensive for things that are supposed to be like a single update, like “if on fire, deal 5 damage to the entity.” So, what’s the solution? Store entities with the same combination of traits in an archetype. Now, you can loop over the entities that have the same data structure. Really fast. You can also do SIMD updates and so on.

However, here comes another realization. If an entity is on fire and poisoned, for example, and each deal fixed 5 damage, you can combine them into a single update! This update fusion can also add performance.
Another issue is that the compiler can sometimes do SIMD optimization, however, the standard compiler isn’t quite good at that, so, you’ll need something like (the now deprecated) LoopVectorization to ensure vectorization.

And well, the entity updates are supposed to take the most time. Entity interactions/promotion/etc are sparse, so you need to update them less, but you’d still need to support them somehow, and I have yet to figure out how.

And another thing is that I’ll leave the update fusion task to the LLVM for now. However, one could realize that the rabbit hole goes pretty deep. For example, if one update is v → A1v and another update is v → A2v, where v is a vector and A1 and A2 are fixed matrices, you could also precompute (A2A1) to squeeze out performance. The rabbit hole goes pretty deep. (Perhaps I could add XLA backends or options to store some data in a CUarray (GPU) for performance?) But I’ll get to that later on. As of now, there are still a bunch of issues I’ve yet to figure out, which is why this project is still stuck. My goal is to provide the fastest ECS in order to lay a strong foundation for Julia game devs. To be honest, many game devs won’t need such speed either. Many games are bottlenecked by rendering and not entity simulation, however, for games that do need fast entity simulation (and perhaps even willing to compromise some graphics quality for), including the game I (and I believe many others)want (and currently doesn’t exist, nor will likely come out of any known major studio or indie developers out there), a fast ECS would be really valuable.

That’s not correct in general but depends on how you design your code. There are many ways around this.

Don’t think this is a reasonable approach. I think you forgot the combinatorial explosion. Say you have 10 traits and each has like 5 possible behaviors/values then you could potentiall generate 10^5 archetypes and for each archetype you’d need to compile specialized functions. This is a huge investment of RAM and also might not even be faster execution wise because the processor has a finite bandwidth for fetching instructions as well (and this competes with the memory bandwidth because the code lives in RAM just like your data).

Conversely, if you don’t comoile special version for each combinations then you just need 50 small functions. And I don’t know why you say that this approach needs to be slow. Dynamic dispatch is slow yes, but you don’t need that. Done right (e.g. with DynamicSumTypes.jl, FunctionWrappers.jl or similar approaches) your “dispatch” can be just a bunch of conditional branches. While not the cheapest instruction possible this is still rather fast (and can get essentially free if you manage to structure your things such that the branch predictor has few misses).

Yes, that’s correct. However, archetypal ECS operates under the principle that entities will not take on all possible combinations. For example, if you have 900 pixies, 2 orcs, and 1 dragon, assuming no status effect, then you’d have 3 archetypes. In the worst case, you will have as many archetypes as the entities. In practice, many entities will share the same archetype. Again, think of warriors fighting on a large battlefield or something along that line. Maybe some of them would be afflicted with status effects, but these are likely highly correlated.

Flecs is an archetypal ECS in C. It stores entities in archetypes. It doesn’t actually compile fused updates like my plan. (Though a manual fusion benchmark suggests that if possible, fusing updates would actually increase performance) Initially, I had the same apprehension as you do regarding combinatorial explosion, but in practice, Flecs works.

I had a brief look and IIUC then flecs uses these archetypes as an organizational tool for the data essentially. I does not compile specific versions of functions for every archetype. The linked blog about the archetypes explicitly mentions type erasure to make things more uniform.

I think that your plan of specializing on every archetype is bound for failure. The main reason being the huge overhead of actually compiling methods (because the compiler is slow). So you’d never want to compile new methods on the fly because that for sure would cause a lag spike. Which means you either need to figure out which archetypes can possible exists (likely impossible) or precompile every variant which is also impossible.
Additionally, I am not convinced that there is much optimization to be gained from having specialized variants for different trait combinations considering the additional resources it takes. You mention a couple of easy cases where you can save like a few instructions but a few instructions. But I think in general it will be very hard to figure out these optimizations reliably. It is also not quite necessary because it could be done by hand, if a certain combinations arises very frequently. Then the game programmer could profile their game, see that there is an optimization opportunity and introduce a new combined trait (i.e. “PoisonedAndBurning” which gets applied whenever something burning also gets poisoned or vice versa).
In summary, I think too much specialization does more harm than good. But I totally understand the appeal. From a theoretical perspective it seems so nice and clean just to offload everything to the compiler but in practice it is likely a bad idea.

I understand the issue. One problem is that you likely don’t know in advance which combination arises frequently before it happens. Your game engine might support various types of troops for example but you won’t necessarily know in advance what kind of loadout would be equipped before the game gets run or what kind of spells would be cast. Perhaps there would need to be a cost model where you only compile fused updates if a large number of entities are present. That being said, let’s say that the entities update at 60 fps and the archetype lives for 20 seconds, and there is only 1 such entity. Well, in such the worst case…
I’ve compiled a simple function using RuntimeGeneratedFunctions.jl and it cost about 5 μs, so, 5000 ns. On the other hand, adding two numbers cost me about 17 ns. If you could save 1 addition per update, that’s already worth it.

The real issue is that the function, when compiled, stays in the ram forever though, outliving its archetype.

However, this is the real issue that prevents my ECS from getting implemented. Unlike other ECSs, mine does not have an indexable ID, each entity is an anonymous entity. This means I have no easy way of expressing things like “These two entities interact with each other.”