GC problems surfaced... there is a reason Julia can't have void references. How do I make a type-stable entity update system?

Imagine I had a list of entities, and an update function.

However, updating them could pose a challenge as the type of each group of entity is different. So, what if I use the function wrapper? But then, how do I store the entity data?

So, a solution is to store data as raw bytes and reinterpret them to whatever data type is needed. So, when I have entities of type A, I could reinterpret the entities of type A to be the type of entities of type A and store the data as raw bytes.

However, a problem emerges, when the entities are not of bit type. When that happens, the GC doesn’t see the pointers stored, only raw data, which means the GC can collect anything.

void pointers are really useful. However, in garbage-collected languages, they can’t simply be done, at least not without runtime types.

What can I do instead of using void pointers?

This should not be the first thing you try.

However, it is very hard to make useful suggestions since you didn’t explain in detail what you want to do. Please give some code that does what you want but is just too slow.

Some ideas:

  1. Dynamic dispatch: This solves the problem but allocates and is slow. But it is a very easy and straight-forward solution. So I suggest you type out an example of your problem using this.
  2. You have a single Entity type that contains the update function as field (use FunctionWrappers.jl for type stability). Now I don’t know where the data lives that this function operates on so I can’t comment on this.
  3. Use DynamicSumTypes.jl and essentially the same code as 1. DynamicSumTypes lets you pack all different types in a common supertype to ensure type-stability but you can still dispatch on the concrete type. I suggest you read the documentation. It is quite fast
4 Likes

Assuming the bytes are for instances of a particular type, do not reinterpret them to other types because dodging whatever validation steps are in constructors is not generally safe. If that data is intended for a particular type but hasn’t been validated, you need to validate manually after reinterpret.

reinterpret doesn’t support non-isbits types. How are you running into “void pointers”? Why do you think that is a problem for garbage collectors, which don’t track isbits types?

1 Like

The update is simple

for entity in entities
    update!(entity)
end

However, entities can have different types(data structures), and the data structure of the entities can be new(unseen by the programmer).
For example, some entities might have positions, while others might have armor, some might have shields, and so on. I group the entities with the same data structures together and compile specialized functions for the entities. However, the “group” still exists in a larger data structure that contains all groups, thus, to have a type-stable structure, you need to store the data as raw bytes and have the specialized functions reinterpret the structures.

Again, perhaps I run into more problems in Julia compared to other languages because I try to push the boundary of what’s possible in Julia whereas I stick to established best practices in other languages.

That just delays the instability to the reinterpret steps, you won’t have any performance improvements. When you need to check an expression’s type at runtime, you either do dynamic dispatch for full flexibility or try to refactor a known set of types into one sum type. This isn’t a Julia quirk, this is true everywhere.

The reinterpreted type is known at the specialized function’s compile time though. The compiled function knows the type of the data. Compared to C, It’s like you have a bunch of arrays, each with its update function, but you can’t store all arrays in a single big array if the type of the arrays are different, so, you store everything as a void pointer and a function that knows the type. This is similar to that mechanism.

You’re missing where the runtime type check is

entity’s type, whether directly in Julia or indirectly through raw bytes with a type tag, changes per iteration, so it has to be checked at runtime. It doesn’t matter if the compiler knows the set of types that entity can take. If entity is raw bytes, the compiler might consider entity in that for loop to have 1 type, but update!'s body has to check entity’s type tag at runtime and reinterpret to a variety of types. That’s what I mean by delaying the instability to the reinterpret steps, @code_warntype just looks red for a callee instead.

This is all moot because reinterpret only handles isbits types, and you still haven’t addressed how you’re running into pointer issues with it.

1 Like

I meant that I would group the entities with the same type together and then use functionwrapper to have different functions, all operating on raw bytes, but each treating the data as different type. Moreover, I don’t mean using Julia’s reinterpret, but rather unsafe reinterpret using unsafe_convert and unsafe_wrap.

You’ll probably get easier help if you give a concrete MWE demonstrating the problems you are facing / what you are trying to solve, rather than having to constantly clarify what you mean.

3 Likes
using FunctionWrappers
import FunctionWrappers: FunctionWrapper

function unsafe_reinterpret(type::Type,raw_data::Vector{UInt8})
    ptr = Base.unsafe_convert(Ptr{type},raw_data)
    
    return unsafe_wrap(Array,ptr,div(length(raw_data),sizeof(type)))
end


struct example_data
    data::Vector{UInt8}
    f::FunctionWrapper{Nothing,Tuple{Vector{UInt8}}}
end

function sample_f(x)
    x2 = unsafe_reinterpret(UInt32,x)
    for i in x2
        println(i)
    end
end

sample_data = example_data(
UInt8.([3,4,5,6])
,FunctionWrapper{Nothing,Tuple{Vector{UInt8}}}(sample_f)
)

function update!(x::example_data)
    A = x.f
    B = x.data
    A(B)
    return
end

update!(sample_data)

This works for data of bits type, however, things get messed up when datas are not of bit type.

I don’t see why you need to do any of this unsafe stuff at all from your example to begin with? I doubt you should be reaching for such complexity right away. Can you address why the ideas from @abraemer are insufficient? Sum types would probably be fine, or your MWE is not fleshed out enough (e.g. you don’t actually show your update loop or different types).

Another question to ask if all this trouble is even worth it. I doubt, even if you got your approach to work, that you would see any performance improvements (it could even be worse) over the alternatives. That this approach clearly fails for non-bits and you want to be using such types is probably a good indicator that this is the wrong approach.

1 Like

Well, the reason is that… well, I’m thinking of making an ECS… it’s a data structure supporting its own programming paradigm.

Here is an example ECS implementation in C.

Here is where it started.

I don’t see what this really explains, e.g. why sum types are insufficient. You should address that. Have you tried it?

I’ll also note that, with a goal this big, it is not likely that you get the interface/implementation completely right on the first go. Just try and get something working first, building up pieces one at a time, before trying to make things so complicated.

1 Like

I think you are confused what “type” is in this context. You conflate the "type"s your entities can have with Julia’s types. While it could seem natural to just map these one-to-one (and would work!) it is not ideal because it will be slow due to Julia’s dynamic dispatch.

Instead of having a type hierarchy like:

abstract type AbstractEntity end
struct EntityWithShield <: AbstractEntity end
struct EntityOnFire <: AbstractEntity end
struct EntityOnFireWithShield <: AbstractEntitiy end

function update!(::EntityWithShield) ... end
function update!(::EntityOnFire) ... end
function update!(::EntityOnFireWithShield) ... end

you should rather use composition like for example:

struct Entity
    traits::Vector{Traits}
end
abstract type AbstractTrait end
struct OnFire <: AbstractTrait end
struct HasShield <: AbstractTrait end

function update!(::Entitiy, ::OnFire) ... end
function update!(::Entity, ::HasShield) ... end

Written like this it is of course slow due to dynamic dispatch, but using DynamicSumTypes.jl you can make it fast.

I understand that you envision having specialized, optimized methods for EntityOnFireWithShield but this is just not a great idea:

  1. Dynamically compiling things is very slow
  2. As you note, you’d need a mechanism to throw out methods that are no longer needed
  3. I don’t think it is universally clear how to combine traits correctly. if they are a simple list then precedence is clearly handled
    Also note that the ECS you refer to does not use a type for every archetype. This is more of matter of organization of the updates. But this organization happens at runtime. So where you go wrong imo is the for entity in entities. Because what flecs does (IIUC) is more like
for archetype in archetypes
    for entity in entities_of_type(archetype)

With this structure perhaps one could realize your dream after all. But I highly recommend starting with a simpler goal, i.e. try to replicate flecs first. And then see how we can add the optimization on top of it.

2 Likes

Admittedly, I want to beat Flecs at least in terms of performance. This is a daunting task, Flecs literally has “fast” in its name. So, I had to… like… plan my move. DynamicSumTypes.jl can work with a few types/traits, but real ECS might need to handle hundreds of traits. Admittedly, one thing I could do is to abandon the idea of supporting non-bits type altogether, because non-bits type can’t simply be vectorized, so is not fast anyway.

You will not get the best implementation right away. That rep’s first commit was back in 2018 - they have six years of refining, it is not realistic. Just try and get something working and then you will figure out where you can improve things by profiling. Trying to get it perfect on the first implementation will only slow you down. Planning is good but perfectionism won’t get you to your goal. You should take @abraemer’s suggestions and advice.

As an example, my package DelaunayTriangulation.jl had a rough implementation on the first go but it was necessary for me to figure out how to improve things and significantly refine it up to now. I initially tried to get it perfect right away but that just meant nothing ever actually got done to begin with.

3 Likes

As another example: DifferentialEquations.jl was not build in a day or a year even. They are still improving every day. It is not important to get everything right from the get-go. You need to start somewhere and then improve. While in theory you could write a perfect ECS directly, in practice you will need to learn a lot of lessons that you’ll only get by trying to build an ECS. So just build something :slight_smile: I am not aware of any Julia ECS so any Julia ECS will be the fastest and best Julia ECS :slight_smile:

2 Likes

Thank you for your advice. Hopefully, I can get to implementing it soon. I think the early versions would only support bits type, and I’ll figure out how to support other use cases later on. The reason I brought up dynamic compilation optimization is actually because it is easy to do in Julia. The early version won’t contain GPU support even though I do plan to get there eventually.

I made a memory pool, that I think may offer a solution to your problem:

Will need to get adapted to your use case, but instead of storing the type + pointer in the array, you give out this MutableReference{T} type, which behaves exactly as the type you’re referencing.
Not sure if that helps, but it definitely can be adapted to be an array type with heterogenous elements.
Of course you may never loose the array to the GC, while operating on the mutable references.

I’m not sure if that supports non-bits types. Still, you now need to manage the memory pool yourself and so on. I was thinking of not supporting non-bits types at all.