I am trying to create generic defaults in an immutable struct for a parametric type.
Say that I am trying to create a linked list, like this:
struct List{T}
value::T
next::RefValue{List{T}}
List(value::_T) where {_T} = new{_T}(value, Ref{List{_T}}())
List(next::List{_T}) where {_T} = new{_T}(zero(_T), Ref(next))
end
As you can see, I can always create next because I can just use the default constructor Ref{List{T}}() which works for any T!
But what about T itself? As you can see in the above example, I’m just using zero, because that is defined for numerical types and vectors which is what I mostly care about. But what about general T? I don’t actually want to access .value if it is uninitialized, but I don’t want to have to make this a mutable struct either (giving me undef), because then I can’t statically compile it!
Also, you can’t statically compile Union, so I couldn’t get away with Union{T,Nothing} either.
Basically, how can I create a default or otherwise uninitialized singleton for arbitrary T when zero is not defined?
No, that requires checking whether next is defined. There’s a better way for doing that, without initialization issues:
struct List{T}
value::T
next::Base.RefValue{List{T}}
List(value::T, list::List{T}) where T = new{T}(value, list)
function List(value::T) where T
list = new{T}(value, Ref{List{T}}())
list.next[] = list
list
end
end
When looking for the end of the list, you can check whether list === list.next[]. If they’re the same object, you’re at the end of the list. There’s never an uninitialized field, so you’re preserving type safety and always get a valid List object.
Sorry I wasn’t expecting people to discuss the type itself, I just wanted the List to act as a simpler discussion point.
The actual type is Node{T} in DynamicExpressions.jl:
Which as you can see is a sum-type of four node types: a constant (with value ::T), a feature (with index ::UInt16), a 1-ary operator, and a 2-ary operator.
Right now it’s a mutable struct but I want to make it immutable (with Ref wrapping the children). However for this I will need to have a way of constructing the arbitrary val::T.
@jling I could take the suggestion of assuming the user will pass an example value. But maybe there’s a simpler way?
You cannot construct arbitrary T without already knowing what exactly that T is and how it should be constructed. For example, I can create a type that has no constructors:
julia> struct Foo
a::Int
bar() = 1
end
julia> Foo
Foo
julia> Foo |> methods
# 0 methods for type constructor
So the best option is to have the user pass in the value and let them worry about creating instances of it.
But I’m not accessing it; it’s just occupying space in the struct. Presumably if it is isbitstype I can somehow initialize enough memory and do a reinterpret somehow? But is there really no cleaner way?
Having the user pass it in is the cleanest way. The Foo struct is just an example; I could have just as well made it like this:
julia> mutable struct Foo
a::Vector{Int}
bar() = 1
end
julia> Foo |> methods
# 0 methods for type constructor
which still has no way to be constructed (and certainly can’t be reinterpreted due to the inner pointer).
Even for isbitstypes, you can’t rely on reinterpret to do the work for you (without already knowing what exactly T is and how it is constructed). For example, I can create a new primitive type Baz 8 end where I have defined a secret/black box constructor only available in a shared C library (serving as a blackbox implementation). You cannot get a valid instance of that type because you cannot know the invariants its bitpattern must observe when constructed (that’s the secret part) - you need to go through that ccall, and reinterpret doesn’t do that.
Another issue here (besides practicality of copying around an example val to all my constructors) is that I would need to copy the val::T for every single node I create (lest having all branch nodes point to the same val which would make distributed c). This could end up being a huge allocation.
Is there no workaround here? I could just have this field be Ref{T}(), but that seems like an ugly workaround.
Is there no way I can just tell Julia to ignore that field in a struct?
struct Node{T} <: AbstractExpressionNode{T}
degree::UInt8 # 0 for constant/variable, 1 for cos/sin, 2 for +/* etc.
constant::Bool # false if variable
- val::T # If is a constant, this stores the actual value
feature::UInt16 # If is a variable (e.g., x in cos(x)), this stores the feature index.
op::UInt8 # If operator, this is the index of the operator in operators.binops, or operators.unaops
l::RefValue{Node{T}} # Left child node. Only defined for degree=1 or degree=2.
r::RefValue{Node{T}} # Right child node. Only defined for degree=2.
+ val::T # If is a constant, this stores the actual value
end
and then I can create nodes without the val by simply only passing the first 6 arguments. But this seems pretty arbitrary…
You can, if that val field is at the end of the struct, using the incomplete initialization that @CameronBieganek mentioned above. It just won’t be useful to you, since there is no actual value stored there; #undef means exactly that there isn’t an instance; there is no valid object there (even though the field claims to have type T, #undef is not a T!).
If you want incomplete initialization, then you’ll have to deal with the consequences of #undef Note that a Ref doesn’t save you from that; if a user ever gets the idea to access the Ref before actually storing a value, they’ll get the same error as if they had accessed an uninitialized field:
The reason for this is that just having the right amount of memory allocated is not the same as having an actual object stored in that memory. If you want that field to always contain a valid object, that field needs to be set before the constructor returns, either through mutation or by passing a valid object to new.
I mean, nothing mentioned so far prevents static compilation in principle, so I’m not exactly sure what you’re referring to here. You can statically compile mutable structs just fine, I’m guessing you’re more worried about a potential heap allocation? That’ll likely happen either way if T is not a bitstype.
You’ve mentioned at the start that you don’t want to access .value if it’s not initialized - but that’s exactly what #undef is. It’s uninitialized memory, so if you ever forget to check before accessing that field and it’s not initialized, you’ll get that UndefRefError. If you want to be type safe there, you’ll want to make sure that this is always a valid instance of a T, and that means having the user worry about constructing it and let them pass it in (or use a Ref and punt the #undef there, talking sternly to your users that it’s their responsibility to fill it).
Thanks. I should specify I know I will never access .val if it’s not initialized. This is because Node is a sum type and you can infer .val is not initialized by reading the value of .degree and .constant (which are initialized).
My point is that Node (to Julia!) is not a sum type. It’s a regular product type, as all structs are, and the language does not help you in any way to preserve that invariant coupling between fields. If there’s ever a path through your code that accesses .val without properly checking the other fields first (or, by some third party just accessing the field with the wrong assumptions), you will sooner or later hit that UndefRefError. The safest option is then to make sure it’s always valid, by either requiring it to be passed in or accepting the type instability and representing the “uninitialized” state with a different type from the “initialized” state.
The general idea for this sort of transition (from ad-hoc, loosely enforced couplings to type-safe abstractions) is called “Making illegal states unrepresentable”. In Julia, at the moment that unfortunately means type instability, because we don’t really have the tools to properly enforce these invariants.
Exactly what I was thinking, but even for numeric types, if restricting to those:
I wouldn’t rely on that (rather one?) for all types, e.g. Rust has NonZeroU8 in std::num - Rust (to make division by zero impossible…), and I’ve been thinking if Julia should have such a type, at least available, if not in Base.
julia> one(String) # wasn't sure what to expect, or that zero not defined
""
julia> zero(String)
ERROR: MethodError: no method matching zero(::Type{String})
There’s a good recent video (or at least to me) about “constructors broken” in C++, vs in Rust (i.e. factories better), and I meant to post a link, think more if it applies to Julia too.
It’s usually the wrong structure (while your problem more general to all containers, also likely for below), and famously hard in Rust…: