Get fieldnames and values of `struct` as NamedTuple

piever · February 11, 2018, 12:49pm

After the discussion in this thread I was wondering whether there is a way in Julia 0.7 to get a NamedTuple starting from a struct. For example:

julia> struct Person
       age::Float64
       name::String
       end

julia> p = Person(34, "Jane")
Person(34.0, "Jane")

julia> (;(v=>getfield(p, v) for v in fieldnames(typeof(p)))...)
(age = 34.0, name = "Jane")

Is there a built-in function for this or is this the recommended way to go from a given struct to NamedTuple? The application I have in mind is from the thread linked above: saving an Array of struct in tabular format (as several columns) using NamedTuples (it would work very well with JuliaDB).

jameson · February 11, 2018, 11:50pm

Since any particular instance of a NamedTuple is a functional subset of a named struct type, it seems like you could just open an issue on the table package to remove the type restriction / enable support for use with any type.

foobar_lv2 · February 11, 2018, 11:55pm

I had quite some fun with this one. I think the recommended way is via generated functions, as follows (what you proposed is the naive way):

to_named_tuple_naive(p) = (; (v=>getfield(p, v) for v in fieldnames(typeof(p)))...)

gentup(struct_T) = NamedTuple{( fieldnames(struct_T)...,), Tuple{(fieldtype(struct_T,i) for i=1:fieldcount(struct_T))...}}

@generated function to_named_tuple_generated(x)
    nt = Expr(:quote, gentup(x))
    tup = Expr(:tuple)
    for i=1:fieldcount(x)
        push!(tup.args, :(getfield(x, $i)) )
    end
    return :($nt($tup))
end

So, let’s compare!

using BenchmarkTools
struct Person
       age::Float64
       name::String
       end

struct point
    x::Float64 
    y::Float64 
end
pers=Person(34, "Jane")
pt = point(3.4, 4.5)
@show to_named_tuple_naive(pers);
#to_named_tuple_naive(pers) = (age = 34.0, name = "Jane")
@show to_named_tuple_generated(pers);
#to_named_tuple_generated(pers) = (age = 34.0, name = "Jane")

@btime to_named_tuple_naive($pers);
#4.471 μs (25 allocations: 1.27 KiB)
@btime to_named_tuple_naive($pt);
#2.772 μs (21 allocations: 1.17 KiB)

@btime to_named_tuple_generated($pers);
#8.554 ns (1 allocation: 32 bytes)
@btime to_named_tuple_generated($pt);
#2.208 ns (0 allocations: 0 bytes)

PS.

versioninfo()
#Julia Version 0.7.0-DEV.3943
#Commit dcc39f4d8d* (2018-02-09 22:47 UTC)
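
For readers on released Julia versions (1.x), a non-generated variant can be sketched as below. The names `to_named_tuple_ntuple` and `Point2` are hypothetical and not from this thread, and it is an assumption, not something benchmarked here, that the compiler unrolls the `ntuple` call for small structs. Note that this sketch takes the value types from the field values rather than from `fieldtype`, so it can differ from `gentup` for abstractly typed fields.

```julia
# Hypothetical helper: build a NamedTuple from any struct without @generated.
function to_named_tuple_ntuple(x::T) where {T}
    names = fieldnames(T)                               # tuple of Symbols
    vals = ntuple(i -> getfield(x, i), fieldcount(T))   # field values in declaration order
    return NamedTuple{names}(vals)
end

struct Point2
    x::Float64
    y::Float64
end

nt = to_named_tuple_ntuple(Point2(3.4, 4.5))
```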

philip · February 12, 2018, 5:57pm

Is there an easy way to modify this to run in 0.6.2 ? Neither the naive nor generated function method seems to work (including after modifying to not use fieldcount). Thx

piever · February 12, 2018, 6:28pm

NamedTuples in Julia 0.6 are provided by the NamedTuples package. I’m not sure how to create a NamedTuple with that package without explicitly typing in the fields.

philip · February 12, 2018, 11:04pm

Ahh. OK. Thanks. Guess I have to get 0.7 dev running to give this a whirl.

JeffreySarnoff · February 13, 2018, 7:20am

Is there an inverse (named_tuple_to_struct_generated)?

foobar_lv2 · February 13, 2018, 7:51pm

The boring way is to just splat the tuple into the constructor (as long as the default constructor exists):

persnt = to_named_tuple_generated(pers);
@btime Person($(persnt)...)
#319.253 ns (3 allocations: 160 bytes)
#Person(34.0, "Jane")
ptnt = to_named_tuple_generated(pt);
@btime point($(ptnt)...)
#  336.827 ns (5 allocations: 208 bytes)
#point(3.4, 4.5)

Afaik there is some PR for improving the speed of named tuple splatting underway. I am slightly shocked at how slow this is, but currently too lazy to write a generated function for it, and would wait-and-see whether this goes away on its own.

If you can un-name the tuples then it gets fast:

ptunt=(ptnt...,)
@btime point($(ptunt)...)
#  2.211 ns (0 allocations: 0 bytes)
#point(3.4, 4.5)

But I currently don’t know how to cheaply strip the names (probably another @generated).

foobar_lv2 · February 13, 2018, 8:15pm

Ok, not so lazy.

@generated function strip_names(x)
    tup = Expr(:tuple)
    for i=1:length(x.types)
        push!(tup.args, :(getfield(x, $i)) )
    end
    return :($tup)
end
@btime Person(strip_names($persnt)...)
#8.491 ns (1 allocation: 32 bytes)
#Person(34.0, "Jane")
@btime point(strip_names($ptnt)...)
#  2.209 ns (0 allocations: 0 bytes)
#point(3.4, 4.5)

Maybe I should submit a PR for the faster splat.

edit: Because I always get confused about what @btime actually measures:

fpers(z)=Person(strip_names(z)...)
@btime fpers($persnt)
#  8.502 ns (1 allocation: 32 bytes)
#Person(34.0, "Jane")
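
On released Julia versions, the name-stripping step does not need a hand-written generated function: `Tuple(nt)` and `values(nt)` both return the plain tuple of values. A minimal sketch, with no timings claimed:

```julia
struct Person
    age::Float64
    name::String
end

persnt = (age = 34.0, name = "Jane")

plain = Tuple(persnt)       # plain tuple of the values, names dropped
same = values(persnt)       # equivalent for a NamedTuple
p = Person(plain...)        # splat into the default positional constructor
```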

piever · February 13, 2018, 8:31pm

I think your work deserves a PR. I understand @jameson's opinion that in theory everything that can be done with a NamedTuple could be done with the struct directly, but in practice such a large chunk of the data ecosystem is organized around NamedTuples that I do not believe it can be adjusted to work with structs in general, which IMHO makes your work quite useful.

foobar_lv2 · February 13, 2018, 8:35pm

Nope, just tested https://github.com/JuliaLang/julia/pull/26025; it gives the same speed for strip_names, just with nicer code.

So kudos to andyferris; better wait until his PR is merged or copy-paste the code from his PR if you can’t wait.

edit: Once I fix my personal git-hell I will maybe submit a PR for the struct-to-named-tuple conversion, if you think it is useful.

JeffreySarnoff · February 13, 2018, 9:43pm

thank you.

there is this effort too, FastSplat, which speeds up splatting. I get ~6x (0.6.2) and ~3x (0.7.0-DEV).

jameson · February 13, 2018, 10:26pm

T(fields...) is only a constructor if the type doesn’t define a constructor. Similarly, to_named_tuple_generated would be wrong for any type that defines getproperty. It might make sense for this code to live in a package, with a “buyer-beware” warning, but I don’t think we should put it in base. This is also coupled to the reason Julia APIs often avoid using dot-oriented accessors (specifically, because it encourages excess coupling of a type’s data layout and usage & hinders effective usage of dispatch).
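
The two caveats above can be illustrated with a small sketch. `Celsius` and `Polar` are hypothetical types invented for this example: the first defines its own inner constructor, so splatting field values into `T(fields...)` fails, and the second overloads `getproperty`, so its fields are not the same thing as the properties it presents to users.

```julia
# Hypothetical type whose inner constructor replaces the default positional one.
struct Celsius
    degrees::Float64
    Celsius(; degrees) = new(degrees)   # keyword-only construction
end

# Hypothetical type that exposes a computed property next to its stored fields.
struct Polar
    r::Float64
    theta::Float64
end

function Base.getproperty(p::Polar, s::Symbol)
    s === :x && return getfield(p, :r) * cos(getfield(p, :theta))
    return getfield(p, s)
end
Base.propertynames(::Polar) = (:r, :theta, :x)

# Splatting field values fails because no positional Celsius method exists.
splat_failed = try
    Celsius((degrees = 20.0,)...)
    false
catch err
    err isa MethodError
end

p = Polar(2.0, 0.0)
stored = fieldnames(Polar)      # what a field-based conversion sees
public = propertynames(p)       # what the type presents to users
```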

foobar_lv2 · February 13, 2018, 10:59pm

Thanks!

So @piever, the best option is to copy-paste the struct_to_named_tuple code into your own code, or maybe submit it to some place where it fits (e.g. the packages that use named tuples in the data-frame/table ecosystem; you obviously have my permission to do so, I don’t need credit for these 5 lines; it’s just like all the pointer-based non-allocating array-views that everyone copy-pastes and modifies). If it doesn’t belong in base then it doesn’t belong in base, so I can procrastinate fixing my git for some more time, yay!

Splatting of named tuples will be reasonably fast once https://github.com/JuliaLang/julia/pull/26025 has landed, so the strip_names can be forgotten (modulo maybe unions: Tuples are covariant and named tuples are not, so I’m not sure whether there lurk performance dragons when splatting named tuples with Union{Int,Missing} fields).
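
The covariance remark can be checked directly with subtype queries. This sketch only demonstrates the type relationship; it says nothing about whether the performance concern actually materializes.

```julia
# Tuple types are covariant in their parameters.
T1 = Tuple{Int}
T2 = Tuple{Union{Int,Missing}}
tuple_is_subtype = T1 <: T2

# NamedTuple is an ordinary parametric type, so it is invariant.
N1 = NamedTuple{(:a,),Tuple{Int}}
N2 = NamedTuple{(:a,),Tuple{Union{Int,Missing}}}
namedtuple_is_subtype = N1 <: N2

# An explicit upper bound on the value-tuple parameter restores the relationship.
bounded_is_subtype = N1 <: NamedTuple{(:a,),<:Tuple{Union{Int,Missing}}}
```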

piever · February 13, 2018, 11:09pm

I also thought this probably belongs in some data package then (possibly one of the IterableTables group). I’ll make a PR when those packages update to Julia 0.7. Thanks again for your contribution.

davidanthoff · February 25, 2018, 1:35am

I think @jameson is probably right that we should just relax the requirement that things have to be a named tuple and accept any struct everywhere. Query.jl already has no special treatment for named tuples, and it might make sense to do the same for the iterable tables interface. I think I want to think a bit more about this, but that is my current view.
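
As an illustration of what "accept any struct" could look like, here is a minimal sketch of a row-to-column helper that goes through `propertynames` and `getproperty` and therefore treats structs and named tuples alike. `to_columns` is a hypothetical name, not the API of Query.jl or of any iterable tables package, and the sketch assumes a non-empty vector whose rows all share the same properties.

```julia
# Hypothetical helper: turn a vector of rows (structs or named tuples)
# into a NamedTuple of column vectors.
function to_columns(rows::AbstractVector)
    names = propertynames(first(rows))                       # assumes rows is non-empty
    cols = map(n -> [getproperty(r, n) for r in rows], names)
    return NamedTuple{names}(cols)
end

struct Person
    age::Float64
    name::String
end

people = [Person(34, "Jane"), Person(28, "Ali")]
cols = to_columns(people)
nt_cols = to_columns([(age = 1.0, name = "a")])
```

Going through properties rather than fields respects types that overload `getproperty`, at the cost of relying on `propertynames` being defined sensibly for the row type.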

oschulz · November 21, 2018, 10:49am

Do we have conversion between structs and named tuples available in some package by now? If not, @JeffreySarnoff, would you accept something like this as an addition to NamedTupleTools.jl?

JeffreySarnoff · November 21, 2018, 11:04am

sure

JeffreySarnoff · November 21, 2018, 11:59am

Just tagged v0.4.0

using NamedTupleTools

julia> struct MyStruct
           tally::Int
           team::String
       end

julia> mystruct = MyStruct(5, "hometeam")
MyStruct(5, "hometeam")

julia> mynamedtuple = NamedTuple(mystruct)
(tally = 5, team = "hometeam")

julia> mystructtuple = (tally = 18, team = "vistors")
(tally = 18, team = "vistors")

julia> myotherstruct = convert(MyStruct, mystructtuple)
MyStruct(18, "vistors")

oschulz · November 21, 2018, 12:08pm

Oh, nice, thanks!

One question: NamedTupleTools doesn’t really “own” NamedTuple, but defines methods like length (and now convert) for it. Are you aware of any other packages that do that too and could cause ambiguities? Or can we consider NamedTupleTools as “official” enough that that’s fine?