Thanks, very insightful. It’s good to have different options for different use cases, and it’d be great to have an archetypal ECS in Julia. I’m definitely also interested in agent-based models, so it looks like I would be well served by the latter.
I’m looking at the architecture of Ark and it seems really cool. I think creating an ECS like yours in Julia would be awesome, and I will try to take it as inspiration to expand and improve the libraries I mentioned.
But I wonder why you say
> Components should be immutable, so that they actually end up stored in component arrays, and don’t escape to the heap as mutable types would. This means that after updates, components must be replaced. Not nice for the API, but well…
Why not just use vectors as components? E.g., a design similar to StructArrays.jl (JuliaArrays/StructArrays.jl, an efficient implementation of struct arrays in Julia) seems good to me.
Let me check my understanding; @mlange-42, please correct me if I’m wrong. I think StructArrays could be used. The question of mutable vs. immutable has more to do with what is stored in these vectors (the component data). A system iterates through the archetype (which may be a StructArray). It then takes a row and either mutates its fields in place (mutable) or replaces the entire row with new data (immutable). The question then becomes how you design the systems API. But I believe that, thanks to Accessors.jl, you could have a mutable-style API even for immutable component data.
@Tortar I mean individual component objects/structs/classes, not the arrays/vectors/columns per component (type) where they are stored. These would of course need to be mutable. I really have no experience with Julia yet, but I’ve read that all mutable objects are allocated on the heap, and that only references would be stored in the array/column that is intended to hold them. This would be really bad for cache locality.
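A minimal sketch of the distinction discussed here, using only Base Julia. The types `Position` and `MPosition` are hypothetical and exist only for illustration. `isbitstype` reports whether a type is immutable plain data, which is the kind of element a `Vector` can store inline rather than as a reference.

```julia
# Hypothetical component types, for illustration only.
struct Position            # immutable
    x::Float64
    y::Float64
end

mutable struct MPosition   # mutable counterpart
    x::Float64
    y::Float64
end

# Immutable structs made only of plain data are "bits types";
# mutable structs never are.
@show isbitstype(Position)    # true
@show isbitstype(MPosition)   # false

positions = [Position(0.0, 0.0), Position(1.0, 1.0)]

# "Updating" an immutable component means replacing the array element.
p = positions[1]
positions[1] = Position(p.x + 1.0, p.y)
```

The replacement in the last line writes into the array's own memory, so the column itself stays mutable even though its elements are not.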
@simsurace Your understanding of the concept is correct, but as far as I understand StructArray, it might not be the best solution for archetypes, if it is possible at all:
- You don’t know the composition of archetypes in advance, but they need to be generated at runtime when required by a new combination of components on an entity. The column types of an archetype are only known at runtime. (BTW @abraemer this is what Flecs uses type erasure for.) Ark creates these columns using runtime reflection and uses (unsafe) pointer arithmetic to access them (behind the scenes, but it is typesafe from the user’s perspective, and plays well with the GC). Not sure yet how this can be realized in Julia, but there is certainly a way.
- The columns should be able to contain objects/structs (i.e. components), not just primitive values. Not sure whether this is possible with StructArray.
- It should not be necessary to update the entire row; updates of individual components should also be possible. Keep in mind that an entity might have dozens of components (or even hundreds in extreme cases), while a query usually accesses only a few of them, and modifies even fewer. So updating all components of the entity would be quite wasteful.
Ah okay, thanks for the clarification, then I was misreading your sentence.
> You don’t know the composition of archetypes in advance, but they need to be generated at runtime when required by a new combination of components on an entity. The column types of an archetype are only known at runtime.
Is this because you can add/remove components from an archetype at runtime? If yes, I think that indeed it’s a bit challenging in Julia to make it performant because then the type of the archetype won’t be stable throughout the runtime.
> The columns should be able to contain objects/structs (i.e. components), not just primitive values. Not sure whether this is possible with StructArray.
This is definitely possible with a StructArray because the types of the vectors inside it could be anything.
> It should not be necessary to update the entire row; updates of individual components should also be possible. Keep in mind that an entity might have dozens of components (or even hundreds in extreme cases), while a query usually accesses only a few of them, and modifies even fewer. So updating all components of the entity would be quite wasteful.
Also possible because each field in a StructArray is a vector, so you can update one field at a time without problems.
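To illustrate the column-wise update, here is a small sketch that uses a plain `NamedTuple` of vectors as a stand-in for a struct-of-arrays layout. It deliberately does not use the StructArrays.jl API; the component types and the `move!` function are hypothetical.

```julia
# Hypothetical component types, for illustration only.
struct Position
    x::Float64
    y::Float64
end

struct Velocity
    dx::Float64
    dy::Float64
end

# One vector per component type; index i across all columns is one entity.
columns = (
    position = [Position(0.0, 0.0), Position(5.0, 5.0)],
    velocity = [Velocity(1.0, 0.0), Velocity(0.0, -1.0)],
)

# A "system" that reads velocities and writes only the position column.
function move!(pos::Vector{Position}, vel::Vector{Velocity})
    for i in eachindex(pos, vel)
        p, v = pos[i], vel[i]
        pos[i] = Position(p.x + v.dx, p.y + v.dy)
    end
    return pos
end

move!(columns.position, columns.velocity)
```

Only the position column is written; the velocity column and any further columns the entity might have are left untouched.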
Regarding your first point, @mlange-42: if I understood you correctly, you could take a look at TypedTables.jl (JuliaData/TypedTables.jl, simple, fast, column-based storage for data analysis in Julia) for inspiration. Maybe it would be good to support two separate cases: one where the components of an archetype are known at compile time (they don’t change after creation), and another where they can change. Judging from that library and the general concepts of the language, I think the first version would be a bit more performant in Julia than the second.
Strictly speaking, yes, you could know archetype composition at compile time if you are only allowed to create entities with a static set of components. But adding and removing components at runtime is really an important feature of ECS. Without that possibility, usage would be way more limited.
Regarding StructArray, I am not sure whether it would be possible to create the required struct on the fly, from a call to a function with generic parameters, like CreateEntity{Tuple{Position, Velocity, Health}}(world).
Archetypes are stable despite the possibility to add/remove components. An archetype does not change when components on an entity change; instead, the entity (i.e. the entire row) is moved to another archetype (which must be created if it does not yet exist).
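The move described here can be sketched roughly as follows. This is a toy illustration under stated assumptions, not how Ark does it: `Table`, `new_table`, `add_row!` and `move_row!` are hypothetical names, entities are plain `Int` ids, and the columns are loosely typed for brevity.

```julia
# Hypothetical component types.
struct Position; x::Float64; y::Float64; end
struct Velocity; dx::Float64; dy::Float64; end
struct Health; hp::Int; end

# Toy archetype table: entity ids plus one column per component type.
struct Table
    entities::Vector{Int}
    columns::Dict{DataType,Vector}   # loosely typed on purpose
end

new_table(types::DataType...) =
    Table(Int[], Dict{DataType,Vector}(T => T[] for T in types))

function add_row!(t::Table, entity::Int, comps...)
    push!(t.entities, entity)
    for c in comps
        push!(t.columns[typeof(c)], c)
    end
    return length(t.entities)
end

# Move row `row` from `src` to `dst`, attaching the new component `extra`.
function move_row!(src::Table, dst::Table, row::Int, extra)
    push!(dst.entities, src.entities[row])
    for (T, col) in src.columns
        push!(dst.columns[T], col[row])
        col[row] = col[end]   # swap-remove: overwrite with the last element...
        pop!(col)             # ...then shrink the column
    end
    push!(dst.columns[typeof(extra)], extra)
    src.entities[row] = src.entities[end]
    pop!(src.entities)
    return length(dst.entities)
end

pv  = new_table(Position, Velocity)
pvh = new_table(Position, Velocity, Health)
add_row!(pv, 1, Position(0.0, 0.0), Velocity(1.0, 0.0))
add_row!(pv, 2, Position(2.0, 2.0), Velocity(0.0, 1.0))

# Adding Health to entity 1 moves its row; neither table changes its column set.
move_row!(pv, pvh, 1, Health(100))
```

Both tables keep their set of columns throughout; only the rows change places.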
Defining a struct is not required with StructArrays; NamedTuples are sufficient, and you can create new ones at runtime (however, if they get very large you might have some issues). If I understand correctly, when you create an entity with a new combination of components, you have to create a new archetype. Let’s say you already had entities with positions and velocities, so you already had an archetype (let’s say, stored as a StructArray or whatever) with two columns. The two columns would just be vectors in memory, so you could pass them on to the constructor of a new archetype with an additional component. Presumably you also defined some systems that know what to do with these components, and this is probably something you do before running the simulation. Calling the functions that describe the behavior of said systems might incur a one-time penalty the first time, but then be performant. I think Julia’s design could actually fit this kind of problem very well. Knowing everything at compile time would just be a special case allowing some further optimizations. I think somewhere you need to assume that you don’t introduce new archetypes at every time step, otherwise you would need to compile new stuff at each step. But I think real-world systems don’t have this kind of structure most of the time, so this worst case analysis is probably misguided and you would
I think what you are saying is accurate. Still, if creating an entity requires a new struct array, then you’ll have some type-stability problems. For example, how would you store the archetypes? What type would the struct storing them have? If new archetypes require a new combination of components, then the struct holding the archetypes won’t be type stable (unless you specify all the combinations at compile time, but this is infeasible, I guess). I think it’s the sort of thing dividing DataFrames.jl and TypedTables.jl.
Yeah right, I’m still figuring this out as we speak. Certainly some challenges need to be overcome. Probably a lot of things need to be typed loosely but you want to avoid looking them up at every step unless they change (which hopefully does not happen all the time).
Archetypes won’t be created frequently; they usually stabilize quickly. You would also not remove them if they become empty, as they are most likely required again later.
> The two columns would just be vectors in memory, so you could pass them on to the constructor of a new archetype with an additional component.
Not sure what you mean here. The new archetype does not require the columns of the old one. Or do you mean that, when adding another component to an existing entity, you can pass the existing component types over from its old/original archetype?
Maybe I still misunderstand the basic idea. I think I’ll need to look into some toy examples in Ark or some other existing ECS.
What I did want to say was that in your example you might have a vector holding all positions and a vector holding all velocities. Presumably if you add health to your world you will add a vector holding all healths. But I’m probably confused as to what would happen afterwards.
The reason is that I’m talking implementation before having a clear grasp of what the behavior should be, so I’ll pause here.
If I’m guessing the misunderstanding correctly, I think the idea is that adding a component to an existing archetype creates a new archetype, but by default entities are not migrated from the old archetype to the new one. That’s why @mlange-42 was talking about passing types, which are the only thing needed to construct the new archetype. Now that I’m thinking about the implications, this also means that each archetype can be type-stable, but the collection of archetypes can’t.
Not necessarily. When you query the ECS, you say what type you want out of the ECS, right? That is enough to recover type stability.
E.g. your ECS could carry a dictionary or whatever that maps element types to struct arrays of that type. Then, when you query the ECS for e.g. entities of a certain type, you use the type information you have to provide type asserts, so the compiler knows what it’ll get out of the dict.
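A minimal sketch of that idea, assuming a hypothetical `World` that holds plain vectors (rather than struct arrays) in a `Dict{DataType,Any}`. The names `World` and `get_column` are made up for this example; `@inferred` comes from the `Test` standard library and throws if the return type cannot be inferred.

```julia
using Test  # only for @inferred

# Hypothetical container: the values are Vector{C} for varying C,
# but the declared value type is Any, so lookups alone are not inferable.
struct World
    columns::Dict{DataType,Any}
end

# C is known at the call site, so the type assertion gives the
# compiler a concrete return type again.
get_column(world::World, ::Type{C}) where {C} = world.columns[C]::Vector{C}

world = World(Dict{DataType,Any}(Float64 => [1.0, 2.0], Int => [1, 2, 3]))

col = @inferred get_column(world, Float64)   # passes: inferred as Vector{Float64}
total = sum(col)                             # downstream code is type stable
```

The dictionary lookup and the assertion still cost something at runtime, but code after the call sees a concrete `Vector{Float64}`.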
> If I’m guessing the misunderstanding correctly, I think the idea is that adding a component to an existing archetype creates a new archetype, but there is by default no migration of entities from that archetype to the new archetype.
If you added a component to an existing archetype, you would add that component to all entities in the archetype, which is not what you want. You want to add a component to a specific entity, which may require a new archetype (if it does not yet exist), and the entity is moved there.
> When you query the ECS, you say what type you want out of the ECS, right? That is enough to recover type stability
Well, not really. In ECS, there is no such thing as a “type of entity”. You query for specific components, e.g. Position and Velocity. The entities you get may have further components, but we are not interested in them.
When designing for ECS, you have to get rid of the concept of “this entity is a … [e.g. bullet]” / “do X with all … [bullets]”. You have to think in terms of “do X with all entities that have …” (e.g. Position, Velocity and Impact for the bullets).
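One common way to express "all entities that have at least these components" is a bitmask per archetype. The sketch below assumes one bit per component in a `UInt64`; the constants and the `matches` function are hypothetical and not taken from Ark.

```julia
# Hypothetical component ids: one bit per component.
const POSITION = UInt64(1) << 0
const VELOCITY = UInt64(1) << 1
const HEALTH   = UInt64(1) << 2
const IMPACT   = UInt64(1) << 3

# An archetype matches if it has at least the queried components;
# additional components are simply ignored.
matches(archetype_mask::UInt64, query_mask::UInt64) =
    (archetype_mask & query_mask) == query_mask

archetype_masks = [
    POSITION | VELOCITY,            # plain movers
    POSITION | VELOCITY | HEALTH,   # movers with health
    POSITION | VELOCITY | IMPACT,   # "bullets"
    HEALTH,                         # no position or velocity
]

query = POSITION | VELOCITY
matching = findall(m -> matches(m, query), archetype_masks)   # [1, 2, 3]
```

The query never names a "kind" of entity; the bullets are found only because their archetype happens to include Position and Velocity.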
It seems to me that if a system that cares about a fixed set of components (I will presuppose that, maybe it is not always true), say Position and Velocity, holds references to all archetypes that contain these two components (plus optionally other components), or better references to views into those archetypes, iteration over those views could be made type stable. When entities move between archetypes because they change other components, the system does not see this, it just gets the updated archetypes at the next step.
EDIT: sure, the container holding all of these archetypes is abstractly typed, but it should not be in the hot path. But to fully convince myself of this I think I would need to think about the user-facing API first and then fill in the details.
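A rough sketch of keeping the abstractly typed container out of the hot path with a function barrier. All names are hypothetical: `archetype_columns` stands in for the views a system would hold, `step!` is the kernel, and `run_system!` is the outer loop.

```julia
# Hypothetical component types.
struct Position; x::Float64; y::Float64; end
struct Velocity; dx::Float64; dy::Float64; end

# Loosely typed registry: one (positions, velocities) pair per matching archetype.
archetype_columns = Any[
    ([Position(0.0, 0.0)], [Velocity(1.0, 1.0)]),
    ([Position(1.0, 1.0), Position(2.0, 2.0)],
     [Velocity(0.5, 0.0), Velocity(0.0, 0.5)]),
]

# Kernel: compiled for concrete column types, so the inner loop is type stable.
function step!(pos::Vector{Position}, vel::Vector{Velocity})
    for i in eachindex(pos, vel)
        p, v = pos[i], vel[i]
        pos[i] = Position(p.x + v.dx, p.y + v.dy)
    end
    return nothing
end

# Outer loop: one dynamic dispatch per archetype, not per entity.
function run_system!(cols)
    for (pos, vel) in cols
        step!(pos, vel)
    end
    return nothing
end

run_system!(archetype_columns)
```

The cost of the untyped container is paid once per archetype per step, which should be small as long as archetypes hold many entities.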
Maybe a way to collaborate on this, @mlange-42, if you are interested, would be for you to draft a Julia API of something similar to Ark with your experience that we could iterate, and then we can fill in the implementation together trying to make it as performant as possible.
I think I have figured out a way to handle all this. Give me one or two days for experimentation, and I will share a draft. Just one question: how much overhead would this conversion be:
function _get_storage(world::World, id::UInt8, ::Type{C})::_ComponentStorage{C} where C
    storage = world._storages[id]::_ComponentStorage{C}
    return storage
end
where World._storages is:
_storages::Vector{Any}
I guess it is not a problem, as the type C is known?
Does storage contain the same component e.g. Vector{Float64} or is storage a sort of archetype? In any case this will have some overhead (tens of ns), but in the first case I think the type stability will not propagate after the function call, otherwise it will. In any case, I think as @simsurace said, a more concrete MWE would help in understanding what is needed to isolate the type instability.
Each storage can only contain one type of component:
struct _ComponentStorage{C}
    data::Vector{Union{Nothing,Vector{C}}} # Outer Vec: one per archetype
end
function _ComponentStorage{C}() where C
    _ComponentStorage{C}(Vector{Union{Nothing,Vector{C}}}())
end
function _ComponentStorage{C}(archetypes::Int) where C
    _ComponentStorage{C}(Vector{Union{Nothing,Vector{C}}}(nothing, archetypes))
end
My idea is to invert the architecture, so archetypes look like this and only reference storages:
struct _Archetype
    entities::Vector{Entity}
    components::Vector{UInt8} # Indices into the global ComponentStorage list
    mask::_Mask
end
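To see how these pieces might fit together, here is a self-contained sketch built from the definitions above. Everything not shown in the post is an assumption made for illustration: the `Entity` and `_Mask` stand-ins, the `_archetypes` field on `World`, the `_get_column` helper, and the way columns are allocated.

```julia
# Assumed stand-ins for types the post does not show.
struct Entity
    id::UInt32
end
const _Mask = UInt64

struct _ComponentStorage{C}
    data::Vector{Union{Nothing,Vector{C}}}   # outer vector: one slot per archetype
end
_ComponentStorage{C}(archetypes::Int) where {C} =
    _ComponentStorage{C}(Vector{Union{Nothing,Vector{C}}}(nothing, archetypes))

struct _Archetype
    entities::Vector{Entity}
    components::Vector{UInt8}   # indices into the world's storage list
    mask::_Mask
end

struct World
    _storages::Vector{Any}            # _ComponentStorage{C} for varying C
    _archetypes::Vector{_Archetype}   # assumed field, not shown in the post
end

_get_storage(world::World, id::UInt8, ::Type{C}) where {C} =
    world._storages[id]::_ComponentStorage{C}

# Assumed helper: the column of component C for one archetype.
function _get_column(world::World, id::UInt8, ::Type{C}, archetype::Int) where {C}
    col = _get_storage(world, id, C).data[archetype]
    col === nothing && error("archetype $archetype has no component $C")
    return col
end

# One archetype that uses storages 1 (Float64) and 2 (Int).
world = World(Any[_ComponentStorage{Float64}(1), _ComponentStorage{Int}(1)], _Archetype[])
push!(world._archetypes, _Archetype(Entity[], UInt8[1, 2], _Mask(0b11)))
_get_storage(world, UInt8(1), Float64).data[1] = Float64[]
_get_storage(world, UInt8(2), Int).data[1] = Int[]

# Add one entity with both components.
push!(world._archetypes[1].entities, Entity(1))
push!(_get_column(world, UInt8(1), Float64, 1), 3.5)
push!(_get_column(world, UInt8(2), Int, 1), 42)
```

Because each storage is parameterized by its component type, the only loosely typed step is the lookup in `_storages`, which the type assertion resolves.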
Sorry, but I think the API will look different compared to the one for Go, so I need to experiment a bit and can’t make a concrete suggestion yet. These are my first-ever lines of Julia.
But I can already create entities with (uninitialized) components in my prototype. I will make the repo public in the next few days, so you can have a deeper look.