Memory layout of structs with mutable fields


I was thinking of asking for a valid chunk of heap memory sized for the number of elements I want to fill in, and then copying the previously created ones into it.


As far as I understand, this really is foremost about implementation issues, not about tradeoffs.

One of the most relevant cases is

struct some_wrapper{parentT, extra1, extra2, extra3}

where parentT is non-bitstype, e.g. Vector{Int}.

Such objects should always be stored inline, both in arrays and in other structs or mutable structs, and should only ever get heap-allocated in non-inferred code. There is no trade-off: we simply pay an extra object allocation plus pointer indirection for no gain.

I would be super happy if struct with a single non-singleton field could be special cased to simply be the pointer to the parent; then, all such abstractions would be guaranteed to be always zero-cost in type-stable settings (and afaict this would not even need changes to garbage collection).
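To make the pattern concrete, here is a hedged sketch: `Tagged` is a hypothetical single-field wrapper that adds only type information on top of a `Vector{Int}`, which Julia's introspection functions can characterize directly.

```julia
# Hypothetical wrapper for illustration: it adds type information to a
# Vector{Int} without adding any data.
struct Tagged
    parent::Vector{Int}
end

isbitstype(Tagged)                      # false: the field is a heap reference
sizeof(Tagged) == sizeof(Ptr{Nothing})  # true: the struct is one pointer wide
```

Under the proposal, a `Tagged` would simply *be* the pointer to its `parent`, so wrapping would cost nothing in type-stable code.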


You can’t store references that way


Because? I actually have trouble finding out what references exactly are in Julia; I thought they were kind of like GC-registered pointers. Can’t I move them around?


Well, as I said above, there are certainly cases where the performance issue is not a problem, but that doesn’t make it not a tradeoff.

First, note that this doesn’t fix the OP’s question.
And this has definitely come up in related discussions (I remember talking about it on at least two occasions).

It’ll be a very strange API. What stops you from stopping at one field, for example? It’s almost certainly still a win for two fields. Additionally, a Tuple of one element and a Tuple of two elements are going to have completely different layouts.


Yes, they are managed by the GC, so you must not move them around without letting the GC know about it.


I guess I’m not meant to interact with the GC in any way?


Well, even if you don’t interact with the GC in any way, the GC will still interact with you by freeing objects behind your back. Even if you keep them alive elsewhere, the code you write won’t be directly comparable, since it’s missing the few extra instructions needed when you do interact with the GC.


Is the basic issue with moving references around that the GC tracks them through pointers to their memory position? Can I repoint those tracked pointers? (I’m getting a little off-topic.)


Errr, I’m actually not sure what you actually mean… But in any case, the answer is most definitely no. We don’t have any Julia API for managing object references (other than accessing/assigning fields and array elements).


So what? Let the perfect not be the enemy of the good. Adding type information to some parent without adding data is a common enough pattern, and the performance tradeoff is completely unambiguous.

big_payload::NTuple{500, UInt64} 

is a possible counterexample to deciding inlining by the number of fields. Furthermore, if multiple fields can get inlined, then somebody would need to teach the garbage collector that arrays can contain a mix of pointers and non-pointers. For the single-field special case, this issue goes away.
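The counterexample can be spelled out: a single-field struct can still be a huge isbits blob, so field count alone is a poor inlining criterion. (`Big` is a name chosen here purely for illustration.)

```julia
# A one-field struct that is nevertheless 4000 bytes of isbits data.
struct Big
    big_payload::NTuple{500, UInt64}
end

isbitstype(Big)  # true: no references at all
sizeof(Big)      # 4000 bytes; inlining it everywhere means 4000-byte copies
```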

The single-pointer special case would not distinguish the layouts of one- and two-element tuples; for example, types like Tuple{Vector{Int}, Nothing, Missing, Val{42}} or Tuple{Tuple{typeof(sum), Vector{Int}}, Val{0}} would still be stored inline (because there is only one non-singleton element). Singleton fields are important for parametric types that have a field <:Function: you must keep this as a field in order to work correctly on closures, but for ordinary named functions it is a singleton that is not stored at all.
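These singleton claims can be checked directly with Julia's introspection functions:

```julia
# Nothing, Missing, and Val{42} are all singleton types, so this tuple type
# carries exactly one non-singleton field (the Vector{Int}).
T = Tuple{Vector{Int}, Nothing, Missing, Val{42}}
count(t -> !Base.issingletontype(t), fieldtypes(T))  # 1

# An ordinary named function is a singleton of its own type:
Base.issingletontype(typeof(sum))                    # true
```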

The next generalization would be: structs that are pointer-homogeneous (all fields are concrete (in particular non-Union) references or singletons, and the size is < N) get inlined, applied recursively. There is no possibility of instantiating objects that would become circular, so this is well-defined. This would still require no change to the garbage collector.

edit: The proposed optimization could be alternatively described as: Always collapse immutable pointer chains. This would never increase storage demands, and would not require gc changes.


That’s a very bad argument. It’s not perfect vs. good here, since it’s not just a simple optimization. You WILL get people confused by doing this.

So basically what you are proposing is completely irrelevant to this thread.

That’s trivial. The codegen and runtime part is much harder.

What? I’m talking about Tuple{BigInt} and Tuple{BigInt,BigInt} have completely different behavior.

And that’s exactly the bad direction.

Why not? struct A x::A y::A end?

I don’t know why everyone keeps mentioning the GC. The GC is actually the easiest part, since it’s very strictly controlled code. The runtime and codegen handling is much harder. Still, most of the pieces are there, but it’s just not a good time to make the change, and partial changes like what you propose here are just very “dirty” and without a clear definition of where they should stop, making them unpredictable.


It seems that right now, the easiest way to do what I want is indeed to create the version with pointers to arrays stored separately, and manage the validity of those pointers myself.


Don’t do that. It’s at most a premature optimization.


I agree, it’s more a theoretical idea.


Now, depending on your actual usage, you might be able to use two separate arrays for the two fields, i.e. a StructOfArrays layout. That is still a premature optimization if you do it now, but once you know everything about your usage it’s a very valid optimization that you can apply, unlike the one you proposed, which seems to bypass everything in Julia.
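A minimal struct-of-arrays sketch of this suggestion, using hand-rolled hypothetical names rather than any existing package API:

```julia
# Instead of one array of two-field structs, keep one array per field.
struct TwoFieldSoA
    a::Vector{Float64}
    b::Vector{Int}
end

# Logical element i of the "array of structs" is (s.a[i], s.b[i]).
element(s::TwoFieldSoA, i) = (s.a[i], s.b[i])

soa = TwoFieldSoA(Float64[], Int[])
push!(soa.a, 1.5)
push!(soa.b, 7)
element(soa, 1)  # (1.5, 7)
```

Each field array can then store its data inline and contiguously, which is exactly what the array-of-wrappers layout fails to do today.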


And I’ll add that even a single field will have the same issue. So one additional minor argument is that Jeff really didn’t like the rule related to circular references even though we proved that it can be decided at type definition time.


And how do you instantiate an object of type A? You would need an inner constructor that permits uninitialized fields (and whether fields can be uninitialized is part of the type information).

Clear definition: collapse all concrete, immutable, always-initialized pointer chains. Whenever you would store x::T, and T is concrete, immutable, non-isbits, cannot have uninitialized fields, and has sizeof(T) == sizeof(Ptr{Nothing}), then store the contents of x instead of a reference to an object of type T with the contents of x.
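The conditions of this rule can be sketched as a predicate built from standard introspection functions; the rule itself is not implemented in Julia, and the "cannot have uninitialized fields" condition is omitted here for brevity.

```julia
# Hypothetical predicate sketching the proposed collapse rule.
function collapsible(T::Type)
    isconcretetype(T) || return false   # concrete
    ismutabletype(T) && return false    # immutable
    isbitstype(T) && return false       # non-isbits: it must hold a reference
    sizeof(T) == sizeof(Ptr{Nothing})   # exactly one pointer-sized slot
end

collapsible(Tuple{Vector{Int}})  # true: a pure single-reference wrapper
collapsible(Tuple{Int})          # false: isbits, nothing to collapse
```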

Sizes of structs and tuples, whether they are pointer or bitstype, sizes of fields, etc. all stay the same. Sizes of arrays, and whether they are pointer-typed, do not change. The only change is that intermediate objects in pointer chains that can be reconstructed from type information become implicit instead of explicit.

But obviously you know far better how difficult such a change would be!

I am only arguing that this optimization/change would be an unambiguous improvement that would affect a lot of code.

If such an optimization is beyond current dev resources, or too difficult, well, that’s a very good reason for not having it! Or if the reason is “that would make the C interface too complicated”, sure. But I completely disbelieve claims like “this would be an ambiguous performance trade-off”.



No it’s not. It’s only used for optimization now and not part of any API.

I agree; I’m only saying it’s a big and “dirty” change, and together with all changes of a similar nature, it’s not a good time to do it without doing other changes first.


Thanks for explaining that!

I’ll disagree on “dirty”, but more interestingly: Could you elaborate on the other changes that should come first?

One issue is what you brought up: whether a field can be null wants to become a proper part of the type information, not just a mere optimization. struct foo x::foo end is uninhabited / has no instances, i.e. it is equivalent to Union{}, while struct bar x::bar; bar()=new(); bar(v::bar)=new(v) end is effectively a Peano-style natural number. These are semantically very different types, and that wants to be reflected in the type information. That’s just a matter of making this info (which is already accessible) part of the official API, and potentially fixing some user-facing syntax (e.g. type definitions could get an optional @dereferenceable x::T or @nullable x::T to make this explicit for people who don’t want to rely on the compiler to decide whether a field can be a null pointer).
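The two definitions above, made runnable (with capitalized names for clarity): `Foo` can never be constructed, while `Bar` behaves like a Peano natural, where the length of the reference chain is the number.

```julia
# Uninhabited: the default constructor needs an existing Foo,
# so no instance can ever exist.
struct Foo
    x::Foo
end

# Peano-style natural number via incomplete initialization.
struct Bar
    x::Bar
    Bar() = new()         # "zero": the field is left uninitialized
    Bar(v::Bar) = new(v)  # "successor"
end

depth(b::Bar) = isdefined(b, :x) ? 1 + depth(b.x) : 0

depth(Bar(Bar(Bar())))  # 2
```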

A second issue, which you explained to me some time ago, is that access to possible null pointers maybe wants to be handled via a CPU trap instead of runtime checks and throws. That is a big change for codegen that waits on better LLVM support, but I don’t see how it directly relates.

What else are the changes that should be done first?