Julep: Efficient Hierarchical Mutable Data

TL;DR

This proposal outlines a method to achieve space- and time-efficient writable substructures in Julia. The memory layout is identical to that of immutable structs within mutable structs. This approach makes it possible to express that two objects with object identity have identical lifetimes. It integrates seamlessly with Julia’s existing language mechanisms.

The proposal requires the introduction of one additional (or reused) keyword before fields and an extension of the new syntax semantics. The proposal’s effectiveness is further enhanced with the introduction of a new Memory type.

I apologize for the length of this post. If this is not the appropriate place to store the proposal, please direct me to the correct location.

Goal

I would appreciate any feedback and improvement suggestions. Since this proposal likely affects core parts of Julia, only a few people might be able to drive this change. If there are no major obstacles and the idea seems worthwhile, I am interested in knowing whether one of the core developers could consider pursuing this idea.

Motivation

My team uses Julia for real-time signal processing. We love using Julia to quickly prototype ideas and then gradually refine the successful ones until they are real-time capable, without needing to port them to another language.

Our main data structures often consist of multiple levels of nested structs, with some fields (both leaf and non-leaf) being arrays. These data structures are typically created at initialization time, i.e., after system startup and before entering the real-time phase. During the real-time phase, efficiency is key. This includes having an optimized memory layout for cache locality and SIMD, efficient interoperability with C, and avoiding allocations.

Current Status

For isbits types, StaticArrays is great. However, most of the values in our data structures need to be modified during the real-time phase. Modification works well when StructArrays is appropriate, for small immutable leaf structures by creating a new instance, and for slightly larger types with Accessors.

However, performance degrades significantly when the data structures are larger, even with Accessors. Using a mutable struct or an Array is often not an alternative, as it adds an indirection via the reference, severely reducing cache locality. Additionally, it often disrupts C interoperability by changing the memory layout and not inlining the (sub)structure or array into the parent struct.

Earlier Attempts

@andyferris implemented modifying non-isbits types in MArray to allow modification of an array where the references are stored inside of a struct. However, he had to remove the implementation, as the conclusion was that there is no way to implement it correctly.

@jameson’s Julep to modify immutables when part of mutables would have helped, but he closed it as not planned. While I got the impression that the language semantics would remain sound if an immutable value stored in a mutable struct effectively became mutable, I am not yet settled on whether it is a good idea to make an immutable type mutable in this context. This is because immutable types are not designed with mutability in mind, which might result in missing modification methods like setindex! or in violations of constructor invariants.

Semantics

Julia provides ease and expressiveness, not just good performance. Instead of looking for an efficient implementation, let’s provide the means to adequately describe the problem so that the compiler can create efficient code.

The earlier attempts seem to have focused on the ability to modify immutables (see above). Yet, the defining property of mutable structures is that they have object identity. Many of the larger types (as in Base.summarysize) have object identity. I suggest first focusing on mutable composite types and then checking later whether there remains a need to extend immutables inside mutable types.

Problem Statement

Let’s define two types Part with object p and Whole with object w. Both types have object identity (= mutable struct). Part makes sense on its own and as part of other types. One of these other types is Whole. We want to express that p is coupled with w for their whole lifetimes. Currently we cannot express this.

With

mutable struct Whole
    part::Part
end

we can express that p and w

  • are created in any order
  • are (weakly) coupled temporarily when p is referenced by w, beginning either when w is constructed or later by setfield!(w, :part, p), and
  • are finalized in any order, either while still (weakly) coupled or after a call to setfield!(w, :part, p_other), with p_other being a different Part object, both objects (happily) living their own object lives.

By adding const as in

mutable struct Whole
    const part::Part
end

we can express a stronger coupling, i.e. that

  • p is created before w,
  • p and w are still coupled at the end of the life of w and
  • p can outlive w.

While I am happy for p, this only helps so much in expressing our problem. (Both of today’s options are shown in the runnable sketch below.)
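
As a runnable illustration of the two couplings described above (WeakWhole and ConstWhole are illustrative names; Part is defined as later in this post):

mutable struct Part
    b::UInt16
    c::Int16
end

mutable struct WeakWhole          # weak coupling: the field can be reassigned
    part::Part
end

mutable struct ConstWhole         # stronger coupling: the field is fixed for w's lifetime
    const part::Part
end

p = Part(1, 2)                    # p can be created long before w
w = WeakWhole(p)
w.part = Part(3, 4)               # the coupling can be broken at any time

cw = ConstWhole(p)
# cw.part = Part(3, 4)            # would error: const fields cannot be reassigned
p.b = 5                           # p itself stays mutable and can outlive cw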

Proposal

So we don’t tell the compiler the exact problem and then wonder why it does not find the best solution? Let’s fix that and define a keyword that expresses that the lifetimes of p and w are fully coupled, i.e.

  • p and w are created at the same time,
  • p and w are always coupled during both of their lifetimes and
  • p and w are finalized at the same time.

I can imagine this keyword to be coupled, static, inline, assimilate, or something similar. If it needs to be an already reserved keyword, local and import seem to be possible choices, though ignoring their usual meaning might not be the best idea. Let’s use inline as a placeholder in the following examples. So this would look like

mutable struct Whole
    inline part::Part
end

That wasn’t too difficult, was it? But what does it mean? Just adding a keyword, i.e., increasing code length, doesn’t bring much benefit unless you are paid by the number of characters in your code.

Memory Layout

As the lifetimes are now fully coupled, p and w do not need two different memory regions with a reference connecting them. So instead of two smaller allocations we guarantee to have one larger allocation, which is more efficient. Nice.

But we should not stop here. As p and w are now part of the same memory region anyway, the memory of p can be put inside the memory of w exactly as it would be if p were immutable.
This is how it would look:

mutable struct Part
    b::UInt16
    c::Int16
end

mutable struct Whole
    a::Int32
    inline part::Part
end

As this is guaranteed to be inlined, sizeof(Whole) will return 8 (instead of 16 on x86-64, due to the pointer plus alignment).

Object identity

But mutable composite types also need to have stable memory addresses. Does p have a stable memory address? Sure. It can be easily calculated by

pointer_from_objref(w) + fieldoffset(Whole, :part)

using

Base.fieldoffset(x::DataType, sym::Symbol) = fieldoffset(x, findfirst(sym |> ==, x |> fieldnames))

As fieldoffset is constant per type, it follows that the object identity of Part is preserved, since for Whole objects w1 and w2 the following holds:

addr(w1.part) === addr(w2.part) ⟺ addr(w1) === addr(w2)

with

const addr = pointer_from_objref

Lifetime

Additionally, tracking the lifetime of objects is crucial. Conceptually, this should be easy, as either every usage of p counts as a usage of w and lifetime ends when w can no longer be reached, or usage of p and w is attributed separately and lifetime ends when both p and w can no longer be reached.

Abstract types

Obviously, inline will only be defined if Part is sufficiently concrete. I don’t know where to draw the line exactly, but allowing everything for which sizeof is defined might make sense. Disallowing isbits types could make sense, as the inline would be ineffective and we probably want to avoid cargo cult programming which puts inline everywhere.

Relationship to const

As shown above, inline is a stronger const, so in that sense combining the two would be redundant. On the other hand, const applied to a field holding an immutable struct makes the fields of that struct constant in the context of the defining struct, which is a useful property. I see no reason why it should be different for mutable structs, therefore

mutable struct Whole
    inline const part::Part
end

should behave like its const immutable struct counterpart.
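
To make the comparison concrete, this is how the existing const immutable counterpart behaves today (ImmutablePart and PendantWhole are illustrative names):

struct ImmutablePart
    b::UInt16
    c::Int16
end

mutable struct PendantWhole
    const part::ImmutablePart
end

w = PendantWhole(ImmutablePart(1, 2))
# w.part = ImmutablePart(3, 4)    # errors: const field cannot be reassigned
# w.part.b = 3                    # errors: fields of an immutable cannot be set
w.part.b                          # reading is fine

Under the proposal, inline const part::Part would presumably behave the same way when accessed through w.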

Recursive Definition?

In

mutable struct C
    d::Int
end
mutable struct B
    c::C
end
mutable struct A
    inline b::B
end

should c also be inlined in a = 42 |> C |> B |> A? That is, should the keyword work recursively, so that c (and therefore d) is also inlined into the memory region of b (and therefore a)?

Coupling the lifetimes of a and b does not logically couple the lifetimes of a and c (or d, for that matter). Staying with the tradition that the answer to a question in a header is always “no”, inline should not work recursively. Additionally, const does not work recursively (which you can easily try yourself), so this is the consistent solution. Even though deep inlining is probably wanted in the majority of cases, it can easily be achieved by defining additional types (as with const), e.g.,

mutable struct StaticB
    inline c::C
end

and using that one in A instead. All methods of B can easily be reused by StaticB if they are defined for an abstract supertype that both B and StaticB subtype (see the sketch below). If defining both types gets tedious, a macro can create the non-static type from the definition of the static type.
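
A runnable sketch of this method-sharing pattern (fresh names to keep it runnable today, and without inline since it does not exist yet; StaticB2 stands in for the static variant):

mutable struct C2
    d::Int
end

abstract type AbstractB2 end

mutable struct B2 <: AbstractB2
    c::C2
end

mutable struct StaticB2 <: AbstractB2   # would carry `inline c::C2` under the proposal
    c::C2
end

# methods written against the abstract supertype are shared by both variants
value(b::AbstractB2) = b.c.d

value(B2(C2(42))) == value(StaticB2(C2(42)))   # true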

Object creation

Up to now, we have assumed that the objects just exist. Now let’s examine how to construct them. All objects ultimately begin their existence with a call to new, as even an outer constructor ultimately calls an inner constructor, which might be the default constructor.

Before proceeding, let's quickly recap how this is currently done to appreciate the new challenge.

Mutable Type

If Whole contains a (weakly) coupled Part as in

mutable struct Whole
    part::Part
end

p will be created sometime before w is actually created. This is true even if the creation of p and w is packed as tightly as possible:

Whole() = new(Part())

Although p is created immediately before w, technically the creation of p somewhere in memory is completed before the creation of w begins somewhere else in memory, independently of p. This approach cannot work for our proposal as it does not align with the memory model.

Immutable Type

If the existing construction of mutable types cannot be reused for our proposal, let’s consider whether we can reuse the approach taken for immutable types. Let’s temporarily use an object i of an immutable type Immutable in Whole:

mutable struct Whole
    immutable::Immutable
end

Regardless of whether the i to be saved in immutable is constructed before the call of the Whole constructor, inside any inner or outer constructor, or inside the new call, i is conceptually created before w. However, in this case, i is copied into w when w is constructed, as i lacks object identity.

Since we want to address a w with object identity, we cannot reuse this approach for our proposal.

New Concept

The concrete types of all inline fields are known. However, there can be multiple inline fields with the same type. We need to ensure that the coupled object creation time is managed correctly and that we do not alter object identity implicitly.

There are several options for how the user interface for creating mutable structs with inline fields could look. Let’s group them by control flow.

Defining p before creating w

We can define the creation of p before the creation of w if we either explicitly or implicitly delay the creation of p until w is created. The compiler needs to ensure that p is not used before w is created. This can be done implicitly by backtracking the arguments to new that correspond to inline fields until their creation. Alternatively, it can be done by annotating the call that creates p and forward tracking the inaccessible variable. A more Julian approach might be to handle this explicitly, either using a closure or by creating a special type like LazyConstructor that wraps the constructor call of p.

All the implicit solutions would need to be limited in scope, e.g. limited to the currently active call of the Whole constructor or to any constructor call of Whole.

All approaches allow for quite some flexibility. But the question is whether the resulting complexity is worth it.
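
Of the explicit options, a wrapper in the spirit of the LazyConstructor mentioned above might look roughly like this (a sketch only; the type, the helper names, and the intended constructor integration are hypothetical):

struct LazyConstructor{F,A<:Tuple}
    f::F
    args::A
end

lazy(f, args...) = LazyConstructor(f, args)

# what the compiler (or `new`) would eventually do with the wrapper
materialize(lc::LazyConstructor) = lc.f(lc.args...)

# intended usage under the proposal (not possible today):
#   part = lazy(Part, 1, 2)
#   w    = Whole(part)   # Whole's constructor materializes `part` into w's memory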

Defining p when creating w

By only allowing the constructor call of p inside of the new call of Whole we lose some flexibility, but the logic becomes simpler. The compiler can request the additional memory for Part in the new call of Whole, then follow the constructor call and handle the new call of Part specially by not requesting additional memory but using the already allocated memory.

As the inlined field’s type is specific, it would be possible to make this changed behavior explicit by, for example, using new instead of Part or just providing the Tuple of the arguments. However, this approach might be more confusing than helpful.

A constructor can call new more than once.

mutable struct Part
    a::Int
    Part() = (new(41) |> println; new(42))
end

It needs to be clarified whether this should be forbidden when used as an inlined field or whether the compiler needs to track which new call produces the constructor’s returned object.

In any case, it is important to ensure that the construction of p remains efficient. Would some kind of implicit dispatch/specialization on inline versus non-inline in the constructors and/or the new call work? This could involve either an additional keyword or just a different function name.

Defining p after creating w

It is also possible to avoid defining p inside the new call in Whole by either leaving the argument out or using _ for the inline field argument. In this case, w.p would need to be assigned by a constructor call (or optionally by some kind of copy) before the constructor ends. This assignment can occur either in the current constructor call or in any constructor call of Whole on the stack.
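
For comparison, today’s closest analogue for reference fields is incomplete initialization, where the field is assigned after new but before the constructor returns (a runnable sketch with illustrative names):

mutable struct LatePart
    a::Int
end

mutable struct LateWhole
    part::LatePart
    function LateWhole()
        w = new()              # `part` is left undefined for now
        w.part = LatePart(42)  # must be assigned before first use
        return w
    end
end

LateWhole().part.a   # 42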

Suggestion

It might be best to ensure that p is created by a call within the new call of Whole. If the user attempts something too complex, such as creating multiple objects in the constructor of Part, Julia can generate an error indicating the issue.

Arrays

The above functionality is already excellent and works on its own, but it becomes even more powerful when arrays support the new mechanism. To achieve this, we likely need a new type StaticMemory{L, T} (temporary name) as the static version of Memory{T}, where L represents the number of elements and T the element type.

Memory has a length field per object, while StaticMemory stores its length in the type. You can think of Memory and StaticMemory{L, T} as being defined as follows:

mutable struct Memory{T}
    const length::Int
    const ptr::T[] # pointing at the position after the object
end

mutable struct StaticMemory{L, T}
    inline data::T[]
end

This pseudo-code uses pointer/array syntax. StaticMemory does not point to objects of type T; instead, it includes them in its own memory.

Building on this type, other types like WVector (a writable vector similar to MVector) that support modifying non-isbits types can inline a StaticMemory object on their own. This is possible because they all have object identity and are thus covered by the proposal.

As a beneficial side effect, you can create an Array type that requires only one allocation, compared to the typical two allocations of, for example, Vector. However, the type is limited to a fixed size or, at least, cannot grow beyond a fixed size.
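
To illustrate the two allocations mentioned above (runnable on Julia ≥ 1.11; .ref and .mem are internal fields, inspected here only for illustration and subject to change):

v = Vector{Int}(undef, 4)
v.ref          # a MemoryRef{Int64}: the Array object references ...
v.ref.mem      # ... a separately allocated Memory{Int64} holding the data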

In this context, the proposal sits between FixedSizeArray (which builds on Memory) and MArray (which builds on Tuple). With this additional proposal, there are three basic memory types in Julia:

Memory type  | Length                | Mutable
------------ | --------------------- | -------
Memory       | Run-time              | Yes
StaticMemory | Compile-time          | Yes
Tuple        | Compile-time (mostly) | No
16 Likes

Wouldn’t it be possible to resolve the issue with a macro? For instance, if a hypothetical macro @inline would transform a struct from:

@inline mutable struct Whole
    a::Int32
    part::Part
end

to a struct:

mutable struct Whole
    a::Int32
    _b::UInt16
    _c::Int16
end

and also adds the necessary getproperty/setproperty! (and getindex) methods so the following syntax would work:

w = Whole(2, Part(1, 4))
# Individual elements could be modified as
w.part.b = 3
# Whereas the whole part could be obtained by constructing it:
part::Part = w.part[]

However, I don’t know about the allocations when constructing the part::Part. Perhaps there is a way to build Part in an unsafe way by doing some pointer arithmetic and, in that way, avoid them.
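
For what it’s worth, a write “into the inlined part” can be done today with pointer arithmetic on the flattened struct (a hedged sketch; FlatWhole stands in for the macro-expanded Whole above, and this bypasses the type system):

mutable struct FlatWhole
    a::Int32
    _b::UInt16
    _c::Int16
end

w = FlatWhole(2, 0x0001, 4)
GC.@preserve w begin
    # address of the flattened `part.b` field inside w's own allocation
    pb = Ptr{UInt16}(pointer_from_objref(w) + fieldoffset(FlatWhole, 2))
    unsafe_store!(pb, 0x0003)   # the effect of `w.part.b = 3`
end
w._b == 0x0003   # true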

3 Likes

Wouldn’t it be possible to resolve the issue with a macro?

I was also thinking about macros first, but I hadn’t found a solution and therefore dismissed it. But maybe I gave up too quickly.

The typical access of a single isbits value should indeed be possible, because nested structs are then basically nested namespaces and they can be flattened with something like Symbol("part.b").
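
A runnable sketch of that flattening idea (all names hypothetical): dotted paths become ordinary field names via var"...", and access goes through the corresponding Symbol:

mutable struct FlatWhole2
    a::Int32
    var"part.b"::UInt16
    var"part.c"::Int16
end

w = FlatWhole2(2, 0x0001, 4)
setfield!(w, Symbol("part.b"), 0x0003)   # what `w.part.b = 3` would lower to
getfield(w, Symbol("part.b"))            # 0x0003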

However, I haven’t found a solution for constructing these. In general you need to support all combinations of all possible constructors of all (recursive) substructures. I wouldn’t know how to do this with macros, but maybe someone else knows.

But even if that is solved, there will be methods which expect a Part object, so that would need to be reconstructed. But if it needs to be a type sometimes anyway, why not keep it as a type all the time?

Perhaps there is a way to build Part in an unsafe way by doing some pointer arithmetics and, in that way, avoiding them.

In my quick tests the manual pointer arithmetic on the nested structs was significantly slower than using the flattened struct. Maybe I did it wrong, but maybe the optimum performance can only be achieved with language support.

This seems to me like a solvable problem, although not an easy thing to do. I personally have always been frustrated writing macros; sometimes an LLM like ChatGPT or Claude helps to put together a foundation.

IMO this is the most important point then. I am also inexperienced in these things, so perhaps someone else can help here.

What about something like MArray from StaticArrays.jl? That is, take a dynamically-sized array and annotate it with type domain information on sizing.

Alternatively, FixedSizeArrays.jl could relax its size field from NTuple{N,Int} to any subtype of NTuple{N,Integer} (or maybe even go with Number instead of with Integer), then one of the various “static int”/“type domain int” types defined in various packages could be used to describe the size of any dimension. Related issue: more options for `size` · Issue #54 · JuliaArrays/FixedSizeArrays.jl · GitHub. NB: this would mean it’s possible to describe both dynamically- and statically-sized dimensions in the same type, like with HybridArrays.jl.

The compiler developers are working on improving Julia’s escape analysis, something that could lead to heap allocation being replaced by stack allocation much more often than is currently possible: wip: overhaul EscapeAnalysis.jl by aviatesk · Pull Request #56849 · JuliaLang/julia · GitHub

Thanks, @nsajko, for your input and the correction above (learning: never start to write code in a tool with auto-correction :-)).

What about something like MArray from StaticArrays.jl? That is, take a dynamically-sized array and annotate it with type domain information on sizing.

As MArray is statically sized (source, documentation), I assume you were referring to SizedArray from StaticArrays. Although SizedArray is immutable, and therefore will be inlined, its data field containing the AbstractArray, e.g. a Memory, will hold a reference to it, as that is mutable:

julia> Memory{Int} |> ismutabletype
true

Alternatively, FixedSizeArrays.jl could relax its size field from NTuple{N,Int} to any subtype of NTuple{N,Integer}

This is definitely a great idea and could maybe be expanded to the general array interface starting from Memory all the way up (at least if it is possible without compromising compatibility). And while this will be the best solution for some cases where you know the maximum number of elements, it will not solve the inlining, as FixedSizeArrays again uses a Memory, as you are obviously aware.

The compiler developers are working on improving Julia’s escape analysis

This sounds very promising, but it would need something like LTO (i.e. full program optimization) with a guarantee that the reference is removed during optimization to ensure C compatibility when including a mutable struct. However, as other types need the reference/pointer for C compatibility, I think there is just no other way than telling Julia which of the two is intended.

1 Like

Why not guarantee this to be inlined without any new keyword asking for it?

Wouldn’t that solve your problem, or would you then want a way to opt out of this?

I can see the status quo might be easier on the compiler, and this would make for longer compilation times(?). Does this need to be a guarantee? Maybe only in non-debug mode, but we could have a faster debug mode or lower optimization for the status quo.

This would also be recursive; and I can foresee we might want to allow the compiler to change the order of (those super) structs, as is allowed by Rust. That makes it more difficult, but it’s a bigger independent change.

Julia already does interprocedural optimisation (IPO); link-time optimisation (LTO) is a form of IPO.

3 Likes

Alternatively, FixedSizeArrays.jl could relax its size field from NTuple{N,Int} to any subtype of NTuple{N,Integer}

This is definitely a great idea and could maybe be expanded for the general array interface starting from Memory all the way up

I just checked: For non-statically sized arrays (= number of elements is not determined by the type) this is already the case as getindex and setindex! already support Integer.

So supporting Integer and not just Int affects probably mostly statically sized arrays and their type parameter(s). And as AbstractArray supports Integers, you might even argue that due to consistency reasons, this is even mandatory to fully cover the array interface.

I might be interpreting the design incorrectly, but I’m not a fan of inlining or coupling mutable types like this. You already hinted at it with object identity and comparing with the layout of immutables inlined in mutables, but the primary characteristic of mutable types is multiple references “observing” a mutation. Because some of the references don’t outlive local scopes or some objects with references can be freed, a mutable instance generally must be allocated independently on the heap instead of the stack or inlined to another object.

By coupling objects like this, you’re sacrificing the free-ability around said objects with references. With the inline proposal, you’re making w live as long as p does, even if you only need p. Because of Part’s object identity, you can’t extract the same instance from the Whole. In the examples of Whole containing little besides Part, that’s not a big deal, but if Whole contains a lot more data like a huge array, you could be using up a lot of memory just for a small Part. This is similar to a common gotcha with views of arrays; while it’s faster and sometimes only correct to make a view, a view instead of a copy keeps the whole array alive, even if you only need a small fraction of its elements. Along with other performance reasons, indexing Arrays thus defaults to copying instead of views, and we must view with care.

vtjnash’s proposal is one version of a generally accepted view that mutables are actually kind of overrated. When we need (or it’s convenient) to have multiple references that observe a mutation, mutable types are appropriate and useful. But if we only need to update data in one place, reassignment of a different immutable instance to one reference works just fine e.g. count+=1. Accessors provides the syntax for doing this with changes to nested fields or elements, but as you said, it hurts performance to go through constructors. While it is possible for Accessors to skip constructors in favor of new instantiation, we’re still relying on the compiler to optimize it to a simple field update. In the extreme case where we can instruct Julia to update fields to accomplish semantic instantiation and reassignment, we wouldn’t even need named mutable types sometimes, we could just reassign things to a Ref instance.

4 Likes

Why not guarantee this to be inlined without any new keyword asking for it?
Wouldn’t that solve your problem, or would you then want a way to opt out of this?

Indeed, inlining and non-inlining need to be available, both for C compatibility and for performance reasons. Let’s look at the following C code:

struct part {
    int c;
    char d;
};

struct whole_inline {
    int a;
    struct part p;
};

struct whole_non_inline {
    int a;
    struct part* p;
};

Both whole_inline and whole_non_inline can exist and need to be supported for C compatibility.

Regarding performance: if part is part of a continuous memory access pattern, inlining helps. If it is not (and it is sufficiently large), it might be better to just store the pointer and keep the memory of part out of the way of the continuous access pattern (better to have a gap of 4 or 8 bytes than a larger one).

I can see the status quo might be easier on the compiler and this would make for longer compilation times

The proposal might be implemented by dispatching/specializing each constructor on Union{Ptr{Part}, Inline, NoInline}.

  • Inline: Allocate a larger memory region (as in Base.summarysize). This logic should not be slower than the existing logic for inlining mutable structs. Some additional accounting might be needed, though.
  • Ptr{Part}: Do not allocate but reuse the provided memory pointed to by Ptr{Part}. Some accounting might be different or might be obsolete.
  • NoInline: Same behavior as now.

Therefore, if inline is not used, compilation speed should be mostly unaffected. If it is used, especially when mixing inlining and non-inlining, more constructors need to be compiled and that will take additional time (as always when using more dispatch/specialization).

I think this is a valid tradeoff: More runtime-optimization can lead to longer compilation time.

I can foresee we might want to allow the compiler to change order of (those super) structs, as is allowed by Rust

This is a compelling idea. However, due to C compatibility (or anything else that relies on the declared layout), this would need to be optional, and it would indeed be a bigger independent change.

I was discussing the size field of FixedSizeArray; you seem to be (?) discussing whether getindex allows non-Int indices; the two are unrelated. So I don’t understand you completely.

[…] a mutable instance generally must be allocated independently […]
By coupling objects like this, you’re sacrificing the free-ability around said objects with references. […]
Along with other performance reasons, indexing Arrays thus defaults to copying instead of views, and we must view with care.

I might not have stated the proposal well in this regard, because I think we agree here. I really like your analogy with array views. There are many reasons why coupling should not be the default, because in the general case it is the wrong model of reality.

However, if you know that you do not have a generic object relationship, but a special one with the property of coupled lifetimes, we should be able to express this property and reap the benefits it brings. This is the same as with array views, which are useful if the lifetimes are coupled and can be a hindrance if not.

That’s why the feature should be opt-in (using the keyword) and off by default (current behavior without the keyword), as it is in the proposal.

While I agree that the false conclusion “we need to modify something therefore it should be a mutable type” is seen too often, I think that defining whether sharing should be done is very important. I think this is well stated in Bertrand Meyer’s Object-Oriented Software Construction, chapter 8.1, section References.

I agree that with perfect compiler optimizations of writes to object fields of immutable types we could avoid mutables and use Ref instead. But what would we gain?

Without other changes, we’d have a really convoluted @set call with manual dereferencing, we would need to reinvent object identity around Ref, and we would need to argue why the recommended way is to mutate an immutable (to be fair, we could just never mention mutable and immutable anywhere, so the last point would be easily solvable).

I think we’d need a strong case that object identity / sharing is an instance property (each instance can be used with Ref or without) and not a type property (which is currently the case). Additionally, we would need convenient setindex! syntax for immutables and Ref.

Without these, the above proposal seems to me like a more natural way to describe problems to the compiler (also known as programming :-)).

2 Likes

I was discussing the size field of FixedSizeArray, you seem to be (?) discussing whether getindex allows non-Int indices, the two are unrelated.

Sorry for the confusion. I initially thought that both size type information (as a type parameter or field) and size access information (in getindex and setindex!) would be affected. I just realized that only the former is relevant and we can ignore the latter, as great people have already solved it.
In short: I share your view. :slight_smile:

1 Like

By coupling objects like this, you’re sacrificing the free-ability around said objects with references.

Thinking a bit more about it, this might hint at the reason for the confusion. I think that, depending on the domain or problem space we are working in, coupling is either a very common case or something bizarre, and sacrificing free-ability is either a no-go or no big deal.

When doing something heavily math-related, object identities are probably nothing you think too much about. And when doing something with a lot of dynamics and therefore short-lived objects, reducing free-ability does not seem like a good idea at all. These domains exist, and this proposal might not bring a lot to the table for them.

But there are the other domains, too. Like in real-time signal processing applications, where you need to avoid allocations for most of the lifetime of your process (or at least your real-time thread). There, you create most of your objects at the beginning and the coupled lifetimes are the default and not the exception. There can still be a lot of object identity involved.

Or assume you want to model some real-world objects like cars, where you want to support replacing the engine but do not want to model the seats as replaceable. So it’s totally fine to couple the lifetime of the seats to the car, but you still want to be able to modify the seats (e.g. their position, heating, cooling, …) without compromising on object identity (the left seat is not the right seat even if the object properties are identical).

1 Like

^this!

What I’d really like is language constructs that do not rely on the compiler to optimize, but instead guarantee it. Otherwise, we must always check whether the compiler got it right.

For that reason, a kind of inline keyword for struct members would be nice, with the semantics “the compiler either successfully inlines the field into the struct or throws a compilation error/warning” (obviously this only works if the field has a concrete type and is immutable). From that, one would need an update function à la setfield! that is guaranteed to either throw a compilation error/warning or emit the right pointer write.

The only new language feature here is that the compiler complains if the desired features are not realized. I.e. this can be considered as just a linter hint :wink:
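
As a rough approximation of such a check, Base.allocatedinline already reports today whether a type is stored inline when used as a field or array element (illustrative types below):

struct InlinedPart        # immutable, isbits: stored inline
    b::UInt16
    c::Int16
end

mutable struct BoxedPart  # mutable: stored behind a reference
    b::UInt16
    c::Int16
end

Base.allocatedinline(InlinedPart)   # true
Base.allocatedinline(BoxedPart)     # false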


One of the questions OP’s examples don’t answer is:

Suppose you have

mutable struct Whole
    part1::Part
    part2::Part
end

OP now wants, basically, a kind of FieldRef{Part}, i.e. an interior pointer to Whole that permits updating of the part.

It is trivial to implement FieldRef{Whole, Part, (:part1,)} and recursively FieldRef{Whole, Int, (:part1, :someIntFieldOfPart)}, etc. (the fieldref would have only a single field of type Whole, the rest is stored in type information)
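
A hedged sketch of such a FieldRef (the type name is from this post; the fieldref helper, the getindex/setindex! methods, and the path handling are my assumptions):

struct FieldRef{P,T,Path}
    parent::P
end

function fieldref(parent::P, path::Symbol...) where {P}
    T = typeof(foldl(getfield, path; init = parent))
    return FieldRef{P,T,path}(parent)
end

Base.getindex(r::FieldRef{P,T,Path}) where {P,T,Path} =
    foldl(getfield, Path; init = r.parent)::T

function Base.setindex!(r::FieldRef{P,T,Path}, v) where {P,T,Path}
    obj = foldl(getfield, Base.front(Path); init = r.parent)
    setfield!(obj, last(Path), convert(T, v))
end

# usage with the Whole/Part from this thread:
#   r = fieldref(w, :part1, :b)
#   r[] = 0x0003    # setfield! on the Part referenced by w.part1

Note this works today only because part1 is itself a separate mutable object; the new language feature would be needed once Part is inlined into Whole and the write has to go through an interior pointer.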

The thing that makes this a new language feature is the desire that typeof(fieldref(w::Whole, :part1)) === typeof(fieldref(w, :part2)), i.e. that the type does not imply the offsets from the parent object.

And I’m very unsure how that kind of language feature would fit into julia.

Maybe it’s enough to have a new primitive Core.Intrinsics.interiorPointerSet(parent, offset, value) that writes value to pointer_from_objref(parent)+offset, has the right aliasing inference, and includes a GC write barrier on parent? (i.e. do the necessary marking if value is young generation and parent is old generation)

That would not be an overly complex primitive. Maybe one could learn from the Java people and do that as Core.Intrinsics.Experimental.interiorPointerSet? From that, packages could implement experimental fieldref types, using lots of generated functions (and it would be up to the package to throw in the generator if somebody does something stupid like creating a fieldref to something with a Union type, or to something that isn’t inlined as expected).

1 Like

No?

As I see it, it’s a performance issue, yes, and you could either do it always, or even let the compiler decide (it’s an optimization problem which is faster). You can get a pointer to p either way, for using it in C. Yes, it matters which you get if you want a pointer to the full struct. Would it be enough, in many cases, to be able to ask the compiler what it did, if it’s not forced to always inline?

My point still stands: when do you really NOT want it inlined (only very rarely?), or even better, when do you want to control it (I suppose many Julia/scientist programmers wouldn’t want to, or wouldn’t know about the issues), i.e. not just let the compiler control it? It could, I guess, be a hint to the compiler via a macro, not needing a new keyword.

1 Like

In that case I would prefer that Whole is wholly responsible for the lifetime, not Part. That again lines up with immutables inlined into mutables, and new instantiation and reassignment being optimized to a field update would accomplish the same thing as mutating inlined mutables. Inlining mutables is in a way the opposite direction of Ref-ing immutables, but the latter seems like an easier option to deal with.

If you really need otherwise equal seats with different object identities, a field for an identity value would work for immutables. More to the point, note that mutable objects with equal values have different object identities because of their stable and independent memory addresses. You would sacrifice that independence by inlining. If you are just mutating each seat, then it would seem as if the different addresses of the seat’s fields would suffice. But consider that someone might want to instantiate a new seat with the left seat’s values and swap out the left seat of (mutating) the car; memory address would fail to recognize the newly installed left seat as different from the older left seat when it was still in the car, and there’s no good way to treat the identity of the older left seat after its removal. Of course, you addressed this by describing inline as a more extreme coupling than const, which already doesn’t allow said swapping, but a non-const inlined immutable is evidently more flexible by letting the mutable struct mutate a field.

It also occurs to me that inlined mutables with object identities still tied to memory address would make it impossible for 2 fields to be assigned to the same instance; that’s how important independence is for a mutable object to have multiple references.

1 Like

Yes, it matters which if you want to get a pointer to the full struct.

Either a pointer, or when the whole object is passed by value. As the memory layout is different, it’s necessary that inlining can be controlled by the user. However, I agree that most Julia users won’t bother, and a hint to the compiler with a macro would be fine.

Would it be enough to know, to ask the compiler what it did in many cases, if it’s not forced to always inline?

No, that would not be enough, as “sorry, we are incompatible with the C library we need to interact with, but luckily I wrote pages of manual memcopy code as a fallback solution just in case, am happy with maintaining duplicate structure field information, and wasn’t really interested in good performance anyway” does not sound like a realistic scenario. :wink:

when do you really NOT want it inlined

That depends on what we are talking about exactly. If the reference gets modified or if p exists way before w, inlining is obviously not possible. That said, I assume in all cases where inlining is possible (the whole program is sound with identical results independent of whether you inline or not), inlining will be in the majority by far and not inlining will be the exception.

It’s again only memory access patterns and certain C compatibility cases where not inlining is the right decision (but there might be other cases I am not thinking of currently).

A solution where the compiler is free to do whatever it wants regarding inlining and two macros where one would guarantee inlining and the other would guarantee non-inlining would work, too.

In that case I would prefer that Whole is wholly responsible for the lifetime, not Part.

I agree this sounds like the better concept. I am just not sure about the internals (but you probably are). My thinking: An object of a mutable type being managed by another object (merged object) might or might not be close to current implementations. That’s why I didn’t want to close the door for other implementations.

But consider that someone might want to instantiate a new seat […]
but a non-const inlined immutable is evidently more flexible

Indeed, this is more flexible when the seats should be replaceable. However, if you do not need replaceability, you can save the extra field for a separate object id and get the super-fast setfield with the inlined mutable type. It’s a trade-off. In some cases you want to have the flexibility, and in other cases you want to limit yourself on purpose and save on memory and runtime.

Please note that in other respects the inlined mutable is more flexible than the non-const inlined immutable, as you can use sharing for the former but not for the latter. This allows using all the useful functions with an exclamation mark in the name.

I think both have their advantages and disadvantages and therefore we should have both.

It also occurs to me that inlined mutables with object identities still tied to memory address would make it impossible for 2 fields to be assigned to the same instance, that’s how important independence is for a mutable object to have multiple references.

In

mutable struct Whole
    inline part1::Part
    inline part2::Part
end

part1 and part2 could not both contain p, that’s correct. However, in

mutable struct Whole
    inline part1::Part
    part2::Part
end

that would be possible, although I have a hard time imagining why one would want to do this (but you didn’t imply this).

Is this the case you have in mind?