The key question about this proposal to me (which I don’t see addressed, but may have missed), is what happens when a reference to p
outlives any references to w
? In other words, the GC is doing its thing and there are no references to w
. How do you ensure that w
doesn’t get reclaimed? There’s a bunch of talk about lifetimes and their enforcement, but presumably it’s clear that Julia doesn’t have any lifetime enforcement mechanism (nor even a compile-time correctness checking phase, when such a mechanism would run)—objects live as long as they live and you have to handle it.
For just the field, yes, but that’s what the larger mutable struct can accomplish. If we can’t share the seats or wheels, we can at least share the car. If there is no car and we really are trying to share a wheel and its changes, then Ref
can help.
Which means that there are 2 references that can’t share one of their objects. With other instances or types with inline
fields, that’s more references that can’t share a mutable object. That goes against mutable types’ intention and breaks the guarantee that we can assign anything to a reference. For example, say we instantiate w1 = Whole(Part(), Part())
, a whole with 2 different parts, we’re not trying to share things there. Then we try Whole(w1.part2, w1.part1)
; this will fail for an unprecedented reason. Granted, we have something similar with const
disallowing reassignments; however, they do allow any assignment the first time.
My proposal above would look like
struct FieldRef{T}
owner::Any #for GC purposes only
ptr::Ptr{T}
end
and would rely on the hypothetical new Core.Intrinsics.interior_pointerset(owner, ptr, value)
thing that writes value
to ptr
, while applying the GC write-barrier as if value
had been written to a field of owner
.
That would be relevant for
struct Inner
x::Vector{Int}
end
mutable struct Owner
part::Inner
end
reset!(ref::FieldRef{Inner}) = ref[] = Inner(Int[])
owner = Owner(Inner(Int[]))
reset!(getfieldref(owner, :inner)) #needs a GC barrier if owner is old-gen!
That goes against mutable types’ intention and breaks the guarantee that we can assign anything to a reference.
Indeed, that is why inline
would be a stronger const
. The new rules would be:
- You can assign anything to a reference …
- … except if it’s
const
then you can only assign once during construction - … except if it’s
inline
then you can only assign once during construction and this assignment is a newly created object
I had some time this morning to try my hand at a macro-based solution. I believe that it’s 90% there (with the remaining 90% conceptually straight-forward), and mostly in line with the original proposal.
julia> abstract type AbstractElephant end
julia> struct Elephant <: AbstractElephant
hunger::Float64
end
julia> @with_inline_types mutable struct Forest
elephant::@inline(Elephant) # @inline elephant::Elephant also possible, but less
# MacroTools-friendly
normal_field::String
end
Forest
julia> @with_inline_types mutable struct Biome
forest::@inline(Forest)
end
Biome
julia> f = Forest(Elephant(2), "Zoulando") # TODO: overload Base.show
Forest(2.0, "Zoulando")
julia> dump(f)
Forest
elephant__hunger: Float64 2.0
normal_field: String "Zoulando"
julia> f.elephant
ElephantViewFromForest(Forest(30.0, "Zoulando"))
julia> f.elephant.hunger = 30
30
julia> f.elephant.hunger
30.0
julia> b = Biome(Forest(Elephant(50), "Tasmania"))
Biome(50.0, "Tasmania")
julia> fieldnames(Biome)
(:forest__elephant__hunger, :forest__normal_field)
Nested inline structs work out for construction, but field access is broken ATM. AFAICT views don’t compose, and we’d have to define ElephantViewFromForestFromBiome
. Not very difficult, but a bit of a drag.
@btime
looks good (no allocation) for access. Biome(Forest(Elephant(50), "Tasmania"))
does create a needless Forest
object. It’d need some rethinking to avoid that.
Parametric structs would be significant work.
using MacroTools
using MacroTools: @qq
using Base.Meta: quot
""" Defined for types, not objects. """
static_propertynames(::Type{T}) where T = fieldnames(T)
macro with_inline_types(def)
# Example with respect to this
# @with_inline_types struct Forest
# elephant::@inline(Elephant)
# normal_field::String
# end
di = MacroTools.splitstructdef(def)
di2 = copy(di)
name = di[:name] # Forest
inline_code = Expr[] # code to define the type views
di2[:fields] = [] # [(normal_field, Int)]
inlined_type_fields = Pair[] # [:elephant=>ElephatViewFromForest]
prop_names = Symbol[] # [:elephant, :normal_field]
constr_accessors = []
for (f, type) in di[:fields]
push!(prop_names, f)
if @capture(type, @inline(Texpr_))
T = eval(Texpr) # having to `eval` is unfortunate. there are other ways worth
# investigating, such as `getfield(@__MODULE__, Texpr)`, that would
# have different trade-offs
s_fields = [Symbol(f, "__", sub_f) for sub_f in fieldnames(T)] # [:elephant__hunger]
view_name = Symbol(Texpr, "ViewFrom", name) # technically should also append __f
type_view_code = quote
struct $view_name <: $(supertype(T)) # ElephantViewFromForest <: AbstractElephant
obj::$name
end
function Base.getproperty(view::$view_name, p::Symbol)
# if p == :hunger
# return getfield(view, :obj).elephant__hunger
# end
# error("Field not found")
$([:(if p === $(quot(fieldname))
return getfield(view, :obj).$s_fieldname
end)
for (fieldname, s_fieldname) in zip(static_propertynames(T), s_fields)]...)
error("type ", $Texpr, " has no property ", p)
end
Base.propertynames(view::$view_name) = $(fieldnames(T))
function Base.setproperty!(view::$view_name, p::Symbol, x)
$([:(if p === $(quot(fieldname))
return getfield(view, :obj).$s_fieldname = x
end)
for (fieldname, s_fieldname) in zip(static_propertynames(T), s_fields)]...)
error("type ", $Texpr, " has no property ", p)
end
end
for s_field in fieldnames(T)
push!(constr_accessors, :(getfield($f, $(quot(s_field)))))
end
push!(inline_code, type_view_code)
push!(inlined_type_fields, f => view_name)
append!(di2[:fields], map(tuple, s_fields, fieldtypes(T)))
else
push!(constr_accessors, f)
push!(di2[:fields], (f, type))
end
end
@qq(begin # @qq to get good line numbers
$(MacroTools.combinestructdef(di2))
$(inline_code...)
function Base.getproperty(obj::$name, p::Symbol)
# if p == :elephant
# return ElephantViewFromForest(obj)
# end
# return getfield(obj, p)
$([:(if p === $(quot(fieldname))
return $sub_type(obj)
end)
for (fieldname, sub_type) in inlined_type_fields]...)
return Base.getfield(obj, p)
end
Base.propertynames(obj::$name) = $prop_names
static_propertynames(::Type{$name}) = $prop_names
function $name($(prop_names...))
$name($(constr_accessors...))
end
$name
end) |> esc
end
Your code can’t deal with
julia> @with_inline_types mutable struct BigForest
elephant1::@inline(Elephant) # @inline elephant::Elephant also possible, but less
# MacroTools-friendly
elephant2::@inline(Elephant)
normal_field::String
end
julia> f=BigForest(Elephant(1.0), Elephant(2.0), ""); f.elephant1.hunger #should be 1.0
2.0
Right, that’s what this comment was about:
view_name = Symbol(Texpr, "ViewFrom", name) # technically should also append __f
so
view_name = Symbol(Texpr, "ViewFrom", name, "__", f)
then it would work out I believe, but maybe it’d need some more tweaking.
“Newly created object” is not a simple condition. In Whole
’s inner constructor, we can at least make sure we call its new
at some point. However, Whole
has no idea if Part
was instantiated within Whole
’s constructor call, even if Part
’s constructor is directly called; the Part()
call is not guaranteed to call its own new
, let alone return a Part
instance. Even if we can infer the instantiation of Part
within Whole
, it’s not a sufficient condition. We have to also confirm that it won’t be initially stored elsewhere, like another conflicting inline
field, before any one of Whole
’s fields. As mentioned before, x = Part(); return new(x, x)
would fail for 2 inline
fields, but it could also fail if we instantiate something else that stores the Part
in an inline
field. Avoiding failure requires narrow, brittle, and opaque conditions possibly through layers of callers, and dynamic dispatch can prevent the compiler from resolving this. The compiler also can’t fall back to non-inline
storage without overheads.
The “easy” solution to this is for new
to force a newly created object by copy
ing the instance before inline
-ing the copy, much like immutable instances. We don’t have to care about whether the input was stored elsewhere or if the compiler can tell. It could be possible for the compiler to optimize away the instantiation of the original instance if it can be inferred that it occurs within the outer mutable struct
’s constructor. But if we’re copying things around like this, immutables make more sense. However, updating fields by instantiation and reassignment does have its own issues to work out regarding object identity of nested immutable and mutable structs, it’s not as straightfoward as a.b.c.d = e
currently meaning mutating the d
field of a.b.c
with no (direct) changes to a
or b
. For example, if any layer is mutable, we’d likely want to mutate it there rather than instantiate the whole thing and reassign, e.g. not x = x0 = Ref(1); @reset x[] = 2; (x !== x0, x0, x)
.
I completed the (POC of the) macro-based version this morning. Now it works for nested structs:
abstract type AbstractElephant end
struct Elephant <: AbstractElephant
hunger::Float64
end
@with_inline_types mutable struct Forest
elephant::@inline(Elephant) # @inline elephant::Elephant also possible, but less
# MacroTools-friendly
normal_field::String
end
@with_inline_types mutable struct Biome
forest::@inline(Forest)
end
julia> b = Biome(Forest(Elephant(50), "Tasmania"))
Biome(50.0, "Tasmania")
julia> b.forest.elephant.hunger = 33.0
33.0
julia> dump(b)
Biome
forest__elephant__hunger: Float64 33.0
forest__normal_field: String "Tasmania"
julia> @btime $b.forest.elephant.hunger
2.768 ns (0 allocations: 0 bytes)
33.0
I used generated functions to limit the complexity of the macro expansion. The way it’s written, it might also not be that difficult to support parametric types…
PatrickHaecker Would that meet your needs?
I get the appeal of getting a new keyword in base to solve such a problem, but as you probably are well aware, that does not happen very often. Pushing this macro to the finishing line is more likely to work out… Happy to help if there is interest.
Full code:
using MacroTools
using MacroTools: @qq
using Base.Meta: quot
""" Defined for types, not objects. """
static_propertynames(::Type{T}) where T = fieldnames(T)
get_inlined_type_fields(x) = Val((;))
@generated function get_view_property(::Val{inl_typ_fields}, view::T, ::Val{p2}) where {T, p2, inl_typ_fields}
# When evaluating `b.forest.elephant.hunger`, this is called twice. First time O is
# ForestView{:forest, Biome}, and we return
# ElephantView{:forest__elephant, Biome}(view.obj)
# (simplified) from the first branch below while second time we return
# view.obj.forest__elephant__hunger
# from the second branch. The idea being that each nested View object has, as
# parameter, the corresponding field prefix.
@assert length(T.parameters) == 2
p1 = T.parameters[1]
obj_expr = Expr(:call, :getfield, :view, quot(:obj))
if haskey(inl_typ_fields, p2)
sub_view = inl_typ_fields[p2]
p12 = quot(Symbol(p1, "__", p2))
return :($sub_view{$p12, typeof($obj_expr)}($obj_expr))
else
return Expr(:call, :getfield, obj_expr, quot(Symbol(p1, "__", p2)))
end
end
@generated function set_view_property!(::Val{inl_typ_fields}, view::T, ::Val{p2}, new_val) where {T, p2, inl_typ_fields}
@assert length(T.parameters) == 2
p1 = T.parameters[1]
obj_expr = Expr(:call, :getfield, :view, quot(:obj))
if haskey(inl_typ_fields, p2)
# Eg. this happens for
# b.forest.elephant = Elephant(...)
error("TODO: must copy all fields")
else
# TODO: put a `convert` call to the right type
return Expr(:call, :setfield!, obj_expr, quot(Symbol(p1, "__", p2)), :new_val)
end
end
macro with_inline_types(def)
# Example with respect to this
# @with_inline_types struct Forest
# elephant::@inline(Elephant)
# normal_field::String
# end
di = MacroTools.splitstructdef(def)
di2 = copy(di)
name = di[:name] # Forest
inline_code = Expr[] # code to define the type views
di2[:fields] = [] # [(normal_field, Int)]
inlined_type_fields = Dict() # [:elephant=>ElephantView]
prop_names = Symbol[] # [:elephant, :normal_field]
constr_accessors = []
for (f, type) in di[:fields]
push!(prop_names, f)
if @capture(type, @inline(Texpr_))
T = eval(Texpr) # having to `eval` is unfortunate. there are other ways worth
# investigating, such as `getfield(@__MODULE__, Texpr)`, that would
# have different trade-offs
s_fields = [Symbol(f, "__", sub_f) for sub_f in fieldnames(T)] # [:elephant__hunger]
view_name = Symbol(Texpr, "View")
type_view_code = quote
struct $view_name{F, U} <: $(supertype(T)) # ElephantView <: AbstractElephant
obj::U
end
Base.propertynames(view::$view_name) = $(fieldnames(T))
Base.getproperty(view::$view_name, p::Symbol) =
get_view_property(get_inlined_type_fields($Texpr), view, Val{p}())
Base.setproperty!(view::$view_name, p::Symbol, x) =
set_view_property!(get_inlined_type_fields($Texpr), view, Val{p}(), x)
end
for s_field in fieldnames(T)
push!(constr_accessors, :(getfield($f, $(quot(s_field)))))
end
push!(inline_code, type_view_code)
inlined_type_fields[f] = :($view_name)
append!(di2[:fields], map(tuple, s_fields, fieldtypes(T)))
else
push!(constr_accessors, f)
push!(di2[:fields], (f, type))
end
end
@qq(begin # @qq to get good line numbers
$(MacroTools.combinestructdef(di2))
$(inline_code...)
function Base.getproperty(obj::$name, p::Symbol)
# if p == :elephant
# return ElephantViewFromForest(obj)
# end
# return getfield(obj, p)
$([:(if p === $(quot(fieldname))
return $view_type{$(quot(fieldname)), $name}(obj)
end)
for (fieldname, view_type) in inlined_type_fields]...)
return Base.getfield(obj, p)
end
Base.propertynames(obj::$name) = $prop_names
static_propertynames(::Type{$name}) = $prop_names
function $name($(prop_names...))
$name($(constr_accessors...))
end
get_inlined_type_fields(::Type{$name}) = $(Val(NamedTuple(inlined_type_fields)))
$name
end) |> esc
end
Avoiding failure requires narrow, brittle, and opaque conditions possibly through layers of callers, and dynamic dispatch can prevent the compiler from resolving this.
The general case is hard. Although it would be nice to support all cases, I think this is not necessary, as inline
creates some kind of static context rendering Julia’s more dynamic parts incompatible. In the end the semantics only make sense to me if exactly one new
call of Part
happens, if the reference of this call is returned and if an inline Part
does not have a (direct or indirect) Part
field (a non-inlined one might work, but I am not yet convinced that it should be supported).
If this can be inferred (could the effect system help here?), the call is valid, otherwise it is invalid. And as the new
call needs to act differently and we need to hand the memory address to it, all involved functions on the stack between the inner constructor of Whole
and the inner constructor of Part
need another method with an implicit parameter transporting this information.
But if we’re copying things around like this, immutables make more sense.
I agree. Copying sounds compelling in the beginning, but it needs a strong story about object identity to really make sense with mutables. This might mean that nothing can observe the originally created object which is the source object of the copy and that probably brings us back to the new
approach.
Nice. However,
julia> f=BigForest(Elephant(1.0), Elephant(2.0), ""); f.elephant2.hunger
2.0
julia> typeof(f.elephant2)
ElephantView{:elephant2, BigForest}
julia> typeof(f.elephant2) == typeof(f.elephant1)
false
which brings up what I noted upthread:
I think this would really require new (albeit simple) core intrinsics relating to GC write barriers and a very_unsafe_pointer_to_objref(typ::DataType, p::Ptr) = unsafe_pointer_to_objref(p)::typ
that assumes the type instead of checking (so UB instead of exception if it’s wrong, unless check-bounds=yes
, or some compiler flag for the julia runtime is set).
I’m not sure I get why this is a desideratum. Both are AbstractElephants
, just like array views are AbstractArray
s. It’d be problematic for performance if you wanted to, say, put elephant1
and elephant2
inside a Vector
, but given the context of the post (wants to share lifetime of parent object with child) that seems unlikely.
If you wanted a C-style inlined array, as in
struct Forest
@inlined trees::Tree[4]
end
you could implement a TreeArrayView
where the index is not part of the type.
That said, I can see some of the “Why does this have to be so complicated in Julia” point…
I had some time this morning to try my hand at a macro-based solution.
This is indeed impressive macro work!
I used
mutable struct Part
a::Float32
end
@with_inline_types mutable struct Whole
b::Int32
part::@inline(Part)
end
w = Whole(Int32(42), Part(42.0f0))
to stay compatible with the rest of the discussion (but great that the solution is tested with nested structs)
Note that I have only a limited understanding of the macro implementation, so please correct points I haven’t evaluated correctly, so I can correct the list.
These are the pros and cons I see :
pros:
- clearly defined constructor semantics
- seems to have ideal memory behavior (only one alloc, clear GC story)
- field access works with nested structs
- preserves user defined
getproperty
forWhole
:
Base.getproperty(w::Whole, s::Symbol) = s === :c ? true : getfield(w, s)
julia> w.c
true
- Preserves object identity (although I do not understand why)
w1 = Whole(Int32(1), Part(1.0))
w2 = Whole(Int32(1), Part(1.0))
julia> w1.part === w1.part !== w2.part === w2.part
true
- methods involving
Part
can be called (using aUnion
or subtyping):
cons:
- constructor no longer supports type conversion (
Whole(42, Part(42.0)
does not work) - definition syntax more complicated as macro needed both before struct definition and inside field definition
- returns “wrong” fieldnames:
julia> Whole |> fieldnames
(:b, :part__a)
- returns “wrong” type for substructs:
w.part |> typeof
julia> w.part
PartView{:part, Whole}(Whole(42, 42.0f0))
Part
’s methods needs to accept aUnion
or an abstract type (thenPart
needs to use that, too) to work- Does not support getting the memory address
julia> w1.part |> pointer_from_objref
ERROR: pointer_from_objref cannot be used on immutable objects
So while it does work for the simplest of cases, it seems to be too limited (but I might be using it incorrectly).
Edit: corrections due to additional input (see below)
This seems very similar to how views work on vectors:
f(v::Vector) = sum(v)
v = collect(1:10)
vv = view(v, 2:3)
f(v) # 55
f(vv) # ERROR: MethodError: no method matching f(::SubArray{Int64, 1, Vector{Int64}, Tuple{UnitRange{Int64}}, true})
For views, the solution is typically to use AbstractVector
rather than Vector
for dispatch:
f(v::AbstractVector) = sum(v)
Perhaps Part
and PartView
could similarly both be made to subtype AbstractPart
. I know nothing about macros though, so no idea how to accomplish this.
For views, the solution is typically to use
AbstractVector
rather thanVector
for dispatch
Ahh, thanks a lot. @cstjean used an AbstractElephant
and with that calling methods works:
abstract type AbstractPart end
mutable struct Part <: AbstractPart
a::Float32
end
inc(p::AbstractPart) = p.a += 1
julia> w.part |> inc
43.0f0
So
therefore, no methods involving
Part
can be called
becomes a “pro” point that
- methods involving
Part
can be called
However, cons
gets the point
Part
must be changed in general to support method calls
The original idea was that Part
can be used unmodified.
Nevertheless, with the ability to call methods, this gets much more useful and should now cover all the simple cases. It’s still not on the level of having something like this supported by the language itself. But if already fills a niche where in simple cases the performance with modifications should be superior to what is currently possible with immutable structs. I wonder whether a package might make sense until we (hopefully) will have language support for such a feature.
In your specific case, it may be cleaner to define Union
type directly. It’s a matter of personal preference, I would personally define a macro for each inlined type like:
@Part == Union{Part, PartView}
that can be used in the function signatures so that they only need to be changed slightly. For example,
f(x::@Part) = 5
so one simply would need to add @
in front of the types that support views.
Just a nitpick, but in this case using a macro is not necessary at all, just define a constant, like const UPart = Union{Part, PartView}
.