Getting MutableNamedTuple keys from an empty array

Hmm,

I have an empty Vector of MutableNamedTuples with a specific signature, ie if the array has a single element, then I can recover the keys of the underlying NamedTuple by inspecting the element.

But what if the array is empty ? The information appears to be there, since eltype() will indicate the mutableNamedTuple signature, but I don’t know how to access the keys from there.

The extra level of indiraction is part of the problem. A MutableNamedTuple is declared as :

struct MutableNamedTuple{N,T <: Tuple{Vararg{<:Ref}}}
    nt::NamedTuple{N, T}
end

Calling something like

getfield(eltype(empty_array), 1)

returns T<:Tuple{Vararg{Ref}}, which isn’t what I want.

Is there a way to get the keys of the underlying NamedTuple in a straightforward manner ? I’ve not been able to find a related topic, but I might not know the right keyword.

(I imagine it’s possible by inspecting types in the AST, but that seems like overkill for my simple needs).

The reason I encountered this issue that I wanted to preallocate a Tables-like structure of these Tuples, which needs the key names, but that table might grow and I don’t want to have to check indexes with the preallocated values, so would rather keep the table empty and sizehint! it. Of course, I could just preallocate, declare one element, then empty! the vector and I believe that’d solve my problem, but I’m still curious regarding my question.

I think the following should work:

fieldnames(fieldtype(eltype(empty_array), 1))

Or you could define a function like

thekeys(::Type{<:MutableNamedTuple{N}}) where {N} = N
# usage:
thekeys(eltype(empty_array))

Interesting, thanks for the suggestion.

fieldtype(eltype(empty_array), 1))

returns a

NamedTuple{@NamedTuple{a::Float64, b::Float64}, T} where T<:Tuple{Vararg{Ref}}

but

fieldnames(fieldtype(eltype(ts), 1))

yields a

Internal Error: ArgumentError: type does not have a definite number of fields

I didn’t expect this error, perhaps someone else knows the cause

Huh, what’s eltype(empty_array)? Can you provide a dummy example of the actual instances you’re working with?

Also, before we get too deep, let me just flag that NamedTuple{N,T} where {T<:Tuple{Vararg{Ref}}} is unlikely to be the best datatype for any problem. You’re putting every single element of every namedtuple in its own Ref. That’s a lot of tiny heap allocations and a lot of indirection! If these instances mainly exist within arrays, they may not even need to be mutable types themselves because the array as a whole is mutable.

I recommend taking a look at Accessors.jl for some idea of what’s possible in the realm of mutating the immutable.

eltype(empty_array)

returns

MutableNamedTuple{@NamedTuple{a::Float64, b::Float64}}

I didn’t write the original codebase. It’s a simulation node graph system with types and functions that are only known at runtime, and the structure I’m having trouble with is used to preallocate the one that’ll hold the output values.

Profiling memory allocation didn’t showcase any kind of problematic heap allocations after preallocation.

Here’s what I’m seeing when experimenting with this:

julia> struct MutableNamedTuple{N,T <: Tuple{Vararg{Ref}}}
           nt::NamedTuple{N, T}
       end

julia> v = [MutableNamedTuple((; a=Ref(1), b=Ref(2)))];

julia> ft = fieldtype(eltype(v), 1)
@NamedTuple{a::Base.RefValue{Int64}, b::Base.RefValue{Int64}}

julia> fieldnames(ft)
(:a, :b)

This indicates that the corresponding MutableNamedTuple instances are using a NamedTuple type as the list of keys, rather than the typical (:a, :b) or similar. I’m surprised this is possible, and it certainly doesn’t seem like what the author of MutableNamedTuple intended. But I may be wrong, I’m lacking the bigger context.

…yeah, you can construct the type, but I can’t find a way to instantiate it.

@danielwe Your example case works but your vector isn’t empty.

I guess with preallocation I wouldn’t encounter any heap allocation costs until I need to grow the vector, which isn’t ideal. I need to think more about the underlying problem, in any case.

(Incidentally, one look at Accessors.jl’s documentation tells me absolutely nothing of its underlying memory model, I have no clue what it does under the hood and if it could fit my needs)

That’s not relevant, the point is I’m using eltype rather than accessing an element. It would work the same on an empty vector with the same eltype. Maybe the key problem here is that there’s a bug in way you’re instantiating your empty vector, such that it obtains a meaningless eltype? Some example code showing how you arrive at your empty_array would be super helpful!

Accessors.jl doesn’t have a memory model, it just gives you a few convenience macros for manipulating nested Julia structures. For example, it lets you do the following:

julia> using Accessors

julia> v = [(a=1, b=2), (a=3, b=4)]
2-element Vector{@NamedTuple{a::Int64, b::Int64}}:
 (a = 1, b = 2)
 (a = 3, b = 4)

julia> v[2] = @set $(v[2]).b = 20;

julia> v
2-element Vector{@NamedTuple{a::Int64, b::Int64}}:
 (a = 1, b = 2)
 (a = 3, b = 20)

Notice that the elements of the vector v are immutable NamedTuples; still I was able to update one field of one tuple with a single line of code. The point is that you can have immutable array elements and still be able to update their contents since the vector itself is mutable.

struct MutableNamedTuple{N,T <: Tuple{Vararg{Ref}}}
    nt::NamedTuple{N, T}
end

named_tuple = (a = 1, b = 2.0)
typeof(named_tuple)

v = MutableNamedTuple{typeof(named_tuple)}[]
ft = fieldtype(eltype(v), 1)
fieldnames(ft)

if you use an empty vector the way I do here, you’ll get the error I mentioned earlier.

ERROR: ArgumentError: type does not have a definite number of fields

In any case, the whole point of this is to preallocate an output structure for selected datatypes I don’t know the exact amount (but can guess to limit reallocations) and type of at compile-time. So that once the node graph simulations update their own structures every timestep, I can just update fields in the final output structure while limiting reallocations.

I can’t tell at a glance whether the @set macro reallocates a whole NamedTuple under the hood, which sounds like it might be more costly than a Ref indirection.

Conceptually what I’d like to do is a preallocate( sizeof(output_structure) * guess_amount), in such a way that if a reallocation occurs it could just reallocate one block and copy the old raw data into it, and then keep going without any reallocations.

Here’s your problem. You’re declaring the wrong type and won’t be able to use this v for anything. See:

julia> v = MutableNamedTuple{typeof((a=1, b=2))}[]
MutableNamedTuple{@NamedTuple{a::Int64, b::Int64}}[]

julia> push!(v, (a=1, b=2))
ERROR: MethodError: Cannot `convert` an object of type
  @NamedTuple{a::Int64, b::Int64} to an object of type
  MutableNamedTuple{@NamedTuple{a::Int64, b::Int64}}

Closest candidates are:
  convert(::Type{T}, ::T) where T
   @ Base Base.jl:84

Stacktrace:
 [1] push!(a::Vector{MutableNamedTuple{@NamedTuple{a::Int64, b::Int64}}}, item::@NamedTuple{a::Int64, b::Int64})
   @ Base ./array.jl:1118
 [2] top-level scope
   @ REPL[3]:1

Instead, you should do something like

named_tuple = (a = Ref(1), b = Ref(2.0))
MNT = typeof(MutableNamedTuple(named_tuple))
v = MNT[]
ft = fieldtype(eltype(v), 1)
fieldnames(ft)

Immutable objects like NamedTuples live on the stack and are stored inline in concretely typed arrays. The way I’m using the @set macro does indeed semantically create a new NamedTuple, but it’s never heap allocated, and the compiler may be able to optimize away its existence entirely. The net effect is you’re simply overwriting a chunk of an already allocated vector, without any extra allocation or indirection at all. In contrast, a Ref always incurs heap allocation and pointer indirection.

That’s if the array eltype is concrete or a small union. If your array has an abstract eltype, which sounds like might be the case (since you don’t know all the elements’s types in advance?), there will be an extra level of indirection between the vector and its elements, and the new NamedTuple must indeed be allocated on the heap, so your concern is valid.

1 Like

For reference, here’s how I might implement a simple MutableNamedTuple if a vector of NamedTuples doesn’t fit the use case:

using Accessors

mutable struct MutableNamedTuple{NT<:NamedTuple}
    nt::NT
end

function Base.getproperty(mnt::MutableNamedTuple, name::Symbol)
    return getproperty(getfield(mnt, :nt), name)
end

function Base.setproperty!(mnt::MutableNamedTuple, name::Symbol, x)
    return setfield!(mnt, :nt, set(getfield(mnt, :nt), PropertyLens{name}(), x))
end
julia> mnt = MutableNamedTuple((a=1, b=2.0))
MutableNamedTuple{@NamedTuple{a::Int64, b::Float64}}((a = 1, b = 2.0))

julia> mnt.a
1

julia> mnt.b
2.0

julia> mnt.a = 5
5

julia> mnt
MutableNamedTuple{@NamedTuple{a::Int64, b::Float64}}((a = 5, b = 2.0))

julia> mnt.b = -Inf
-Inf

julia> mnt
MutableNamedTuple{@NamedTuple{a::Int64, b::Float64}}((a = 5, b = -Inf))

This keeps the entire tuple together in a single heap-allocated mutable struct, rather than giving each and every element it’s own Ref. It does overwrite the entire tuple for every update, but my hunch is that this is almost always preferable to holding atomized Refs, especially considering that memory reads and writes happen in chunks the size of multiple elements anyway (typically 64 bytes, i.e., 8 Ints/Float64s, if I’m not mistaken).

Alright, thanks for the replies.

I need to delve more into this topic, as I don’t fully understand some of the type subtleties, and the MutableNamedTuple approach might be inadequate (although profiling didn’t come up with any heap allocation issues).

It still feels like the type system is not being as helpful as I’d expect, where I’d like to work with and allocate/copy raw data in big chunks on the heap and have the introspection ease size calculations for runtime types, but instead it feels like I have to jump through hoops and will need to spend time unpacking abstraction layers to understand where memory goes.

But I’m also not experienced enough with Julia, and the codebase is a little unusual, so the issue could be on my end.

Just reiterating that the main issue underlying your original post was a simple coding error, where MutableNamedTuple{typeof(named_tuple)} is not the type you maybe thought it was. This is easy to fix, for example, as I wrote above:

If you need the element type to be wider, that’s also straightforward; for example, if the keys should be fixed but the values can have any type (albeit wrapped in Ref), you can do something like this:

julia> v = MutableNamedTuple{(:a, :b)}[]
MutableNamedTuple{(:a, :b)}[]

julia> fieldnames(fieldtype(eltype(v), 1))
(:a, :b)

julia> push!(v, MutableNamedTuple((a=Ref(1), b=Ref(2.0))));

julia> push!(v, MutableNamedTuple((a=Ref("three"), b=Ref(0x4))));

julia> v
2-element Vector{MutableNamedTuple{(:a, :b)}}:
 MutableNamedTuple{(:a, :b), Tuple{Base.RefValue{Int64}, Base.RefValue{Float64}}}((a = Base.RefValue{Int64}(1), b = Base.RefValue{Float64}(2.0)))
 MutableNamedTuple{(:a, :b), Tuple{Base.RefValue{String}, Base.RefValue{UInt8}}}((a = Base.RefValue{String}("three"), b = Base.RefValue{UInt8}(0x04)))

For the deeper design questions, the jumping through hoops and the unpacking abstraction layers, it’s hard to give more advice without seeing an actual complete code example that I can paste into my own REPL and run to see exactly the same output/errors as you’re seeing on your end. Then we can discuss what modifications might be needed to make it behave like you want. Maybe what’s needed is just some construction and conversion methods, for example, from a NamedTuple like (a=1, b=2.0) to the mutable equivalent, such that you can do push!(v, (a=1, b=2.0)) instead of push!(v, MutableNamedTuple((a=Ref(1), b=Ref(2.0)))? That’s super straightforward—I just don’t know whether that’s what you need. Or, given your talk about copying raw data in big chunks, maybe what you need is to move easily back and forth between working with raw bits and the same bits interpreted as instances of types that are only known at runtime? That’s somewhat more low-level, but could definitely be done, although it might require a redesign to ensure that said types are so-called bitstypes (i.e., contain no internal pointer indirection under the hood).

If you’re able to share a runnable code example, I’d be happy to take a look (hopefully others with more expertise can also weigh in, I’m a bit of a dilettante myself). In either case, best of luck!

It occurred to me that you might be interested in the StructArrays.jl package. It provides a struct-of-array type that looks like an array of structs when needed.

julia> using StructArrays

julia> v = StructArray((a=Any[], b=Any[]))
0-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype @NamedTuple{a, b}

julia> push!(v, (a=1, b=2.0))
1-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype @NamedTuple{a, b}:
 @NamedTuple{a, b}((1, 2.0))

julia> push!(v, (a="three", b=0x4))
2-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype @NamedTuple{a, b}:
 @NamedTuple{a, b}((1, 2.0))
 @NamedTuple{a, b}(("three", 0x04))

julia> v[1]  # looks like an array of NamedTuples when indexed/iterated
@NamedTuple{a, b}((1, 2.0))

julia> v[2]
@NamedTuple{a, b}(("three", 0x04))

julia> v.a  # actually holds one array per field
2-element Vector{Any}:
 1
  "three"

julia> v.b
2-element Vector{Any}:
    2.0
 0x04

julia> v.a[1] = :one;  # let's mutate an element

julia> v
2-element StructArray(::Vector{Any}, ::Vector{Any}) with eltype @NamedTuple{a, b}:
 @NamedTuple{a, b}((:one, 2.0))
 @NamedTuple{a, b}(("three", 0x04))

julia> v[1]
@NamedTuple{a, b}((:one, 2.0))

Oho, thanks a lot. Sorry, I haven’t returned to that problem long enough in the past couple of days to have more to add (and a runnable code example would probably not be that trivial to extract, I’ll need to think about it).

I’m not 100% sure I’ll make structural changes to the outputs in the near terms, it’ll depend on other testing/profiling, but StructsArray looks like it might indeed be practical for my use case. I’ll look into this next week.

1 Like