I was thinking of asking for a valid chunk of heap memory sized for the number of elements I want to fill in, and then copying the previously created ones into it.
As far as I understood, this really is foremost about implementation issues, not about tradeoffs.
One of the most relevant cases is

```julia
struct some_wrapper{parentT, extra1, extra2, extra3}
    parent::parentT
end
```

where `parentT` is a non-bitstype, e.g. `Vector{Int}`.
Such objects should always be stored inline, both in arrays and in other `struct` or `mutable struct`, and only ever get heap-allocated in non-inferred code. There is no trade-off: we simply pay an extra object allocation plus pointer indirection for no gain.
I would be super happy if a `struct` with a single non-singleton field could be special-cased to simply be the pointer to the parent; then all such abstractions would be guaranteed to be zero-cost in type-stable settings (and afaict this would not even need changes to garbage collection).
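To make the cost concrete, here is a small sketch (the `Tagged` name is mine, in the spirit of `some_wrapper` above) showing that such a single-field wrapper is currently not a bits type, so today it becomes a separate heap object holding one pointer:

```julia
# Illustrative wrapper that adds only type information on top of a
# heap-allocated parent (Vector{T}).
struct Tagged{T}
    parent::Vector{T}
end

v = [1, 2, 3]
t = Tagged(v)

# The wrapper itself is not a bits type: it is stored as its own
# heap object containing one pointer to the Vector.
println(isbitstype(Tagged{Int}))   # false
println(t.parent === v)            # true: same underlying Vector
```

Under the proposed special case, `t` would simply *be* that pointer, with the `Tagged` layer existing only in the type domain.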
You can't store references that way.
Because? I actually have trouble finding out what references exactly are in Julia; I thought they were something like GC-registered pointers. Can't I move them around?
Well, as I said above, there are certainly cases where the performance issue is not a problem, but that doesn't make it not a tradeoff.
First note that that doesn't fix the OP's question.
And this has definitely come up in related discussions (I remember talking about it on at least two occasions).
It'll be a very strange API. What stops you from stopping at one field, for example? It's almost certainly still a win for two fields. Additionally, a `Tuple` of one element and one of two elements are going to have completely different layouts.
Yes, they are managed by the GC, so you must not move them around without letting the GC know about it.
I guess I'm not meant to interact with the GC in any way?
Well, even if you don't interact with the GC in any way, the GC will still interact with you by freeing objects behind your back. Even if you keep them alive elsewhere, the code you write won't be directly comparable, since it's missing the few extra instructions needed when interacting with the GC.
Is the basic issue with moving references around that the GC tracks them through pointers to their memory position? Can I repoint those tracked pointers? (I'm getting a little off-topic.)
Errr, I'm actually not sure what you actually mean… But in any case, the answer is most definitely no. We don't have any Julia API to manage object references (other than accessing/assigning fields and arrays).
So what? Let the perfect not be the enemy of the good. Adding type information to some parent without adding data is a common enough pattern, and the performance tradeoff is completely unambiguous.
A struct like

```julia
struct Foo{T}
    some_ref::T
    big_payload::NTuple{500, UInt64}
end
```

is a possible counterexample to deciding inlining on the number of fields. Furthermore, if multiple fields can get inlined, then somebody would need to teach the garbage collector that arrays can contain a mix of pointers and non-pointers. For this special case, that issue goes away.
The single-pointer special case would not distinguish the layout of one- and two-element tuples; for example, types like `Tuple{Vector{Int}, Nothing, Missing, Val{42}}` or `Tuple{Tuple{typeof(sum), Vector{Int}}, Val{0}}` would still be inline (because we only have one non-singleton element). Singleton fields are important for parametric types that have a field `<:Function`: you must have this field in order to work correctly on closures, but in the case of standard functions, it is a singleton that is not stored at all.
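This singleton behavior can be checked directly; a quick sketch using `Base.issingletontype` (the `make_adder` helper is mine):

```julia
# Plain generic functions like `sum` are singleton types; fields of
# such types (and of Nothing, Missing, Val{42}, ...) occupy no storage.
println(Base.issingletontype(typeof(sum)))   # true
println(Base.issingletontype(Nothing))       # true
println(Base.issingletontype(Val{42}))       # true

# A closure captures data, so its type is not a singleton and the
# corresponding field must actually be stored.
make_adder(n) = x -> x + n
println(Base.issingletontype(typeof(make_adder(1))))  # false
```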
The next generalization would be: `struct`s that are pointer-homogeneous (all fields are concrete (esp. non-`Union`) references or singletons, and size < N) get inlined (applied recursively). There is no possibility of instantiating objects that would become circular, so this is well-defined. This would still require no change in the garbage collector.
edit: The proposed optimization could alternatively be described as: always collapse immutable pointer chains. This would never increase storage demands, and would not require GC changes.
That's a very bad argument. It's not perfect vs. good here, since it's not just a simple optimization. You WILL get people confused by doing this.
So basically what you are proposing is completely irrelevant to this thread.
That's trivial. The codegen and runtime part is much harder.
What? I'm talking about `Tuple{BigInt}` and `Tuple{BigInt,BigInt}` having completely different behavior.
And that's exactly the bad direction.
Why not? `struct A x::A y::A end`?
I don't know why everyone keeps mentioning the GC. The GC is actually the easiest part, since it's very strictly controlled code. The runtime and codegen handling is much harder. Still, most of the pieces are there, but it's just not a good time to make the change, and partial changes like what you propose here are just very "dirty" and without a clear definition of where they should stop, making them unpredictable.
It seems that right now, to do what I want to do, the easiest way is just to create the version with pointers to arrays stored separately, and manage the validity of those pointers myself.
Don't do that. It's at most a premature optimization.
I agree, it's more a theoretical idea.
Now, depending on your actual usage, you might be able to use two separate arrays for the two fields, i.e. StructOfArrays. That is still a premature optimization if you do it now, but once you know everything about your usage it's a very valid optimization that you can apply, unlike the one you proposed, which seems to bypass everything in Julia.
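A minimal hand-rolled sketch of that struct-of-arrays layout (names are illustrative; packages exist that automate this pattern):

```julia
# Instead of one Vector of two-field structs, keep one plain array
# per field. Each field array stays contiguous and stored inline.
struct TwoFieldsSoA
    a::Vector{Float64}
    b::Vector{Int}
end

s = TwoFieldsSoA([1.5, 2.5], [10, 20])

# "Row" i is reassembled on demand as a tuple.
getrow(s::TwoFieldsSoA, i) = (s.a[i], s.b[i])
println(getrow(s, 2))   # (2.5, 20)
```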
And I'll add that even a single field will have the same issue. So one additional minor argument is that Jeff really didn't like the rule related to circular references, even though we proved that it can be decided at type-definition time.
And how do you instantiate an object of type `A`? You would need an inner constructor that permits uninitialized fields (and whether fields can be uninitialized is part of the type information).
Clear definition: collapse all concrete immutable always-initialized pointer chains. Whenever you would store `x::T` and `T` is concrete, immutable, non-isbits, cannot have uninitialized fields, and has `sizeof(T) == sizeof(Ptr{Nothing})`, then store the contents of `x` instead of a reference to an object of type `T` with the contents of `x`.
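The eligibility test in that definition can be sketched as an ordinary predicate (my illustration, not a compiler API; the "cannot have uninitialized fields" condition is omitted since it has no simple public query):

```julia
# Sketch of the proposed "collapsible" test for a field type T.
collapsible(T::Type) =
    isconcretetype(T) &&
    !ismutabletype(T) &&
    !isbitstype(T) &&
    sizeof(T) == sizeof(Ptr{Nothing})

struct OneRef
    x::Vector{Int}
end

println(collapsible(OneRef))       # true: immutable, one pointer-sized field
println(collapsible(Vector{Int}))  # false: Vector is mutable
println(collapsible(Int))          # false: isbits, already stored inline
```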
Sizes of structs and tuples, whether they are pointer or bitstype, sizes of fields, etc. all do not change. Sizes of arrays, and whether they are of pointer type, do not change. The only change is that intermediate objects in pointer chains that can be reconstructed from type information become implicit instead of explicit.
But obviously you know far better how difficult such a change would be!
I am only arguing that this optimization/change would be an unambiguous improvement that would affect a lot of code.
If such an optimization is beyond current dev resources, or too difficult, well, that's a very good reason for not having it! Or if the reason is "that would make the C interface too complicated", sure. But I completely disbelieve claims like "this would be an ambiguous performance trade-off".
Yes?
No it's not. It's only used for optimization now and not part of any API.
I agree, I'm only saying it's a big and "dirty" change, and together with all changes of a similar nature, it's not a good time to do it without doing other changes first.
Thanks for explaining that!
I'll disagree on "dirty", but more interestingly: could you elaborate on the other changes that should come first?
One issue is what you brought up: whether a field can be null wants to become a proper part of the type information, not just a mere optimization. `struct foo x::foo end` is uninhabited / has no instances, i.e. is equivalent to `Union{}`, while `struct bar x::bar bar()=new() bar(v::bar)=new(v) end` is effectively a Peano-style natural number. These are semantically very different types, and that wants to be reflected in the type information. That's just a matter of making this info (which is already accessible) part of the official API, and potentially fixing some user-facing syntax (e.g. type definitions would get an optional `@dereferenceable x::T` or `@nullable x::T` to make this explicit for people who don't want to rely on the compiler to decide whether a field can be a null pointer).
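The inhabited case can be written out directly (`Nat` is my name for the `bar`-style type); `isdefined` distinguishes the "zero" instance, whose self-field was left uninitialized, from successors:

```julia
# Peano-style natural numbers via a possibly-uninitialized self-field,
# as in `bar` above. new() with no arguments leaves `pred` undefined.
struct Nat
    pred::Nat
    Nat() = new()          # zero: `pred` left uninitialized
    Nat(p::Nat) = new(p)   # successor
end

tonumber(n::Nat) = isdefined(n, :pred) ? 1 + tonumber(n.pred) : 0

println(tonumber(Nat()))            # 0
println(tonumber(Nat(Nat(Nat()))))  # 2
```

By contrast, the `foo` variant has no `new()` escape hatch, so no instance can ever be constructed.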
A second issue that you explained to me some time ago is that access to possible null pointers maybe wants to be handled via a CPU trap, instead of runtime checks and throws. That's a big change for codegen that waits on better LLVM support, but I don't see how it directly relates.
What other changes should be done first?