Memory layout of structs with mutable fields

I was thinking of asking for a valid chunk of heap memory of the size for the amount of elements I want to fill in, and then copy paste the previously created ones into it.

As far as I understood, this really is foremost about implementation issues, not about tradeoffs.

One of the most relevant cases is

struct some_wrapper{parentT, extra1, extra2, extra3}
parent::parentT
end

where parentT is non-bitstype, e.g. Vector{Int}.

Such objects should always be stored inline, both in arrays and in other struct or mutable struct, and only ever get heap-allocated in non-inferred code. There is no trade-off: We simply pay an extra object allocation plus pointer indirection for no gain.

I would be super happy if struct with a single non-singleton field could be special cased to simply be the pointer to the parent; then, all such abstractions would be guaranteed to be always zero-cost in type-stable settings (and afaict this would not even need changes to garbage collection).

1 Like

You canā€™t store references that way

Because? I actually have trouble finding what references exactly are in Julia, I thought they were kind of like gc registered pointers, canā€™t I move them around?

Well, as I said above, thereā€™s certainly cases where the performance issue is not a problem but that doesnā€™t make it not a tradeoff.

First note that that doesnā€™t fix the OPā€™s question.
And this definitely come up in related discussion (I remember talking about it in at least 2 occasions).

Itā€™ll be a very strange API. What stops you from stopping at one field for example. Itā€™s almost certainly still a win for 2 fields. Additionally, Tuple of one element and of two element are going to have completely different layout.

Yes they are managed by the GC so you must not move them around without letting the GC know about it.

I guess Iā€™m not meant to interact with the gc in any way?

Well, even if you donā€™t interact with GC in any way the GC will still interact with you by freeing the objects behind you. Even if you keep them alive elsewhere the code you write wonā€™t be directly comparable since itā€™s missing the few extra instructions when you are interacting with the GC.

Is the basic issue with moving references around that the GC tracks them through pointers to their memory position? Can I repoint those tracked pointers? ā€“ Iā€™m getting a little offtopic

Errr Iā€™m actually not sure what do you actually meanā€¦ But in any case, the answer is most definitely no. We donā€™t have any julia API to manage object references. (other than accessing/assigning fields/arrays)

So what? Let the perfect not be the enemy of the good. Adding type information to some parent without adding data is a common enough pattern, and the performance tradeoff is completely unambiguous.

some_ref::T 
big_payload::NTuple{500, UInt64} 
end

is a possible counterexample to deciding inlining on the number of fields. Furthermore, if multiple fields can get inlined, then somebody would need teach the garbage collector that arrays can contain a mix of pointers and non-pointers. For this special case, this issue goes away.

The single-pointer special case would not distinguish the layout of one and two-element tuples; for example, types like Tuple{Vector{Int}, Nothing, Missing, Val{42}} or Tuple{Tuple{typeof(sum), Vector{Int}}, Val{0}} would still be inline (because we only have one non-singleton element). Singleton fields are important for parametric types that have a field <:Function: You must have this as a field in order to correctly work on closures, but in the case of standard functions, it is a singleton that is not stored at all.

The next generalization would be: struct that is pointer-homogeneous (all fields are concrete (esp. non-Union) references or singletons, and size < N) get inlined (apply recursively). There is no possibility of instantiating objects that would become circular, so this is well-defined. This would still require no change in the garbage collector.

edit: The proposed optimization could be alternatively described as: Always collapse immutable pointer chains. This would never increase storage demands, and would not require gc changes.

Thatā€™s a very bad arguement. Itā€™s not perfect vs good here since itā€™s not just a simple optimization. You WILL get people confused by doing this.

So basically what you are proposing is completely irrelevant to this thread.

Thatā€™s trivial. The codegen and runtime part is much harder.

What? Iā€™m talking about Tuple{BigInt} and Tuple{BigInt,BigInt} have completely different behavior.

And thatā€™s exactly the bad direction.

Why not? struct A x::A y::A end?

I donā€™t know why everyone keep mentioning the GC. The GC is actually the easiest part since itā€™s very strictly controlled code. The runtime and codegen handling is much harder. Still, most of the pieces are there but itā€™s just not a good time to make the change and partial changes like what you propose here is just very ā€œdirtyā€ and without a clear definition of where it should stop, making it unpredictable.

It seems that right now, to do what I want to do, the easiest way is to just create indeed the version with a pointer to arrays stored separately, and manage the validness of those pointers myself.

Donā€™t do that. Itā€™s at most a premature optimization.

I agree, itā€™s more a theoretical idea.

Now depending on your actual usage, you might be able to use two separate arrays for the two fields, i.e. StructOfArrays. That is still a pre-mature optimization if you do that now but once you know everything about your usage itā€™s very an valid optimization that you can apply, unlike the one you proposed, which seems to bypass everything in julia.

And Iā€™ll add that even a single field will have the same issue. So one additional minor argument is that Jeff really didnā€™t like the rule related to circular references even though we proved that it can be decided at type definition time.

And how do you instantiate an object of type A? You would need an inner constructor that permits uninitialized fields (and whether fields can be uninitialized is part of the type information).

Clear definition: Collapse all concrete immutable always-initialized pointer chains. Whenever you would store x::T and T is concrete, immutable, non-isbits, cannot have uninitialized fields and has sizeof(T) == sizeof(Ptr{Nothing}), then store the contents of x instead of a reference to an object of type T with the contents of x.

Sizes of structs or tuples, whether they are pointer or bitstype, sizes of fields, etc all do not change. Sizes of arrays, and whether they are pointer-type do not change. Only change is that intermediate objects in pointer chains that can be reconstructed from type-information become implicit instead of explicit.

But obviously you know far better how difficult such a change would be!

I am only arguing that this optimization/change would be an unambiguous improvement that would affect a lot of code.

If such an optimization is beyond current dev resources, or too difficult, well, thatā€™s a very good reason for not having it! Or if the reason is ā€œthat would make the C interface too complicatedā€, sure. But I completely disbelieve claims like ā€œthis would be an ambiguous performance trade-offā€.

1 Like

Yes?

No itā€™s not. Itā€™s only used for optimization now and not part of any API.

I agree, Iā€™m only saying itā€™s a big and ā€œdirtyā€ change, and together with all change of similar nature, itā€™s not a good time to do it without doing other changes first.

3 Likes

Thanks for explaining that!

Iā€™ll disagree on ā€œdirtyā€, but more interestingly: Could you elaborate on the other changes that should come first?

One issue is what you brought up: whether a field can be null wants to become a proper part of the type information, not just a mere optimization: struct foo x::foo end is uninhabited / has no instances, i.e. is equivalent to Union{}, and struct bar x::bar bar()=new() bar(v::bar)=new(v) end is effectively a peano-style natural number. Both are semantically very different types, and that wants to be reflected in the type information. Thatā€™s just an issue of making this info (that is already accessible) part of the official API, and potentially fixing some user-facing syntax (e.g. type definitions would get an optional @dereferenceable x::T or @nullable x::T to make this part explicit for people who donā€™t want to rely on the compiler to decide whether a field can be nullpointer)

A second issue that you explained to me some time ago is that access to possible nullpointers maybe wants to be handled via cpu trap, instead of runtime checks and throws. Big change for codegen that waits on better llvm support, but I donā€™t see how this directly relates.

What else are the changes that should be done first?