Get fieldnames and values of `struct` as NamedTuple


#1

After the discussion in this thread, I was wondering whether Julia 0.7 has a way to get a NamedTuple from a struct. For example:

julia> struct Person
       age::Float64
       name::String
       end

julia> p = Person(34, "Jane")
Person(34.0, "Jane")

julia> (;(v=>getfield(p, v) for v in fieldnames(typeof(p)))...)
(age = 34.0, name = "Jane")

Is there a built-in function for this, or is this the recommended way to go from a given struct to a NamedTuple? The application I have in mind is from the thread linked above: saving an Array of structs in tabular format (as several columns) using NamedTuples (it would work very well with JuliaDB).
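To make the tabular use case concrete, here is a minimal sketch that turns a vector of structs into a NamedTuple holding one column vector per field. The helper name `columns` is made up for this illustration and is not part of Julia or JuliaDB; whether a given table package accepts such a NamedTuple of vectors is an assumption to check against that package.

```julia
struct Person
    age::Float64
    name::String
end

# Hypothetical helper (not from the thread): one column vector per struct field.
function columns(rows::AbstractVector{T}) where {T}
    names = fieldnames(T)                                    # tuple of Symbols
    cols = map(n -> [getfield(r, n) for r in rows], names)   # tuple of vectors
    return NamedTuple{names}(cols)
end

people = [Person(34, "Jane"), Person(27, "Sam")]
table = columns(people)
println(table)
```

Each column can then be read as `table.age` or `table.name`.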


#2

Since any particular instance of a NamedTuple is a functional subset of a named struct type, it seems like you could just open an issue on the table package to remove the type restriction / enable support for use with any type.


#3

I had quite some fun with this one. I think the recommended way is via generated functions (what you proposed is the naive way):

to_named_tuple_naive(p) = (; (v=>getfield(p, v) for v in fieldnames(typeof(p)))...)

gentup(struct_T) = NamedTuple{( fieldnames(struct_T)...,), Tuple{(fieldtype(struct_T,i) for i=1:fieldcount(struct_T))...}}

@generated function to_named_tuple_generated(x)
    nt = Expr(:quote, gentup(x))
    tup = Expr(:tuple)
    for i = 1:fieldcount(x)
        push!(tup.args, :(getfield(x, $i)))
    end
    return :($nt($tup))
end
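To see what such a generated function assembles, the same expression can be built by hand outside of any `@generated` body. This is only a sketch: `conversion_expr` is a hypothetical helper name, and it interpolates the type directly instead of wrapping it in `Expr(:quote, ...)`.

```julia
struct Person
    age::Float64
    name::String
end

# Same construction as gentup above, repeated so this block runs on its own.
gentup(T) = NamedTuple{(fieldnames(T)...,), Tuple{(fieldtype(T, i) for i in 1:fieldcount(T))...}}

# Hypothetical helper: build the call expression for a concrete struct type T.
function conversion_expr(T)
    tup = Expr(:tuple)
    for i in 1:fieldcount(T)
        push!(tup.args, :(getfield(x, $i)))   # one getfield call per field
    end
    return :($(gentup(T))($tup))
end

ex = conversion_expr(Person)
println(ex)           # a call of the concrete NamedTuple type on a tuple of getfield calls

nt_type = ex.args[1]  # the concrete NamedTuple type that gets called
println(nt_type((34.0, "Jane")))
```

Because the type and the field count are known when the expression is built, the resulting method body contains no loop and no name lookup at run time, which is consistent with the timings reported in this post.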

So, let’s compare!

using BenchmarkTools
struct Person
    age::Float64
    name::String
end

struct point
    x::Float64
    y::Float64
end
pers = Person(34, "Jane")
pt = point(3.4, 4.5)
@show to_named_tuple_naive(pers);
#to_named_tuple_naive(pers) = (age = 34.0, name = "Jane")
@show to_named_tuple_generated(pers);
#to_named_tuple_generated(pers) = (age = 34.0, name = "Jane")

@btime to_named_tuple_naive($pers);
#4.471 μs (25 allocations: 1.27 KiB)
@btime to_named_tuple_naive($pt);
#2.772 μs (21 allocations: 1.17 KiB)

@btime to_named_tuple_generated($pers);
#8.554 ns (1 allocation: 32 bytes)
@btime to_named_tuple_generated($pt);
#2.208 ns (0 allocations: 0 bytes)

PS.

versioninfo()
#Julia Version 0.7.0-DEV.3943
#Commit dcc39f4d8d* (2018-02-09 22:47 UTC)

#4

Is there an easy way to modify this to run in 0.6.2? Neither the naive nor the generated function method seems to work (including after modifying them to not use fieldcount). Thx


#5

NamedTuples in Julia 0.6 are provided by the NamedTuples package. I’m not sure how to create a NamedTuple with that package without explicitly typing in the fields.


#6

Ahh. OK. Thanks. Guess I have to get 0.7 dev running to give this a whirl.


#7

Is there an inverse (named_tuple_to_struct_generated)?


#8

The boring way is to just splat the tuple into the constructor (as long as the default constructor exists):

persnt = to_named_tuple_generated(pers);
@btime Person($(persnt)...)
#319.253 ns (3 allocations: 160 bytes)
#Person(34.0, "Jane")
ptnt = to_named_tuple_generated(pt);
@btime point($(ptnt)...)
#  336.827 ns (5 allocations: 208 bytes)
#point(3.4, 4.5)

Afaik there is some PR for improving the speed of named tuple splatting underway. I am slightly shocked at how slow this is, but currently too lazy to write a generated function for it, and would wait-and-see whether this goes away on its own.

If you can un-name the tuples then it gets fast:

ptunt=(ptnt...,)
@btime point($(ptunt)...)
#  2.211 ns (0 allocations: 0 bytes)
#point(3.4, 4.5)

But I currently don’t know how to cheaply strip the names (probably another @generated).


#9

Ok, not so lazy.

@generated function strip_names(x)
    tup = Expr(:tuple)
    for i = 1:length(x.types)
        push!(tup.args, :(getfield(x, $i)))
    end
    return :($tup)
end
@btime Person(strip_names($persnt)...)
#8.491 ns (1 allocation: 32 bytes)
#Person(34.0, "Jane")
@btime point(strip_names($ptnt)...)
#  2.209 ns (0 allocations: 0 bytes)
#point(3.4, 4.5)

Maybe I should submit a PR for the faster splat.

edit: Because I always get confused about what @btime actually measures:

fpers(z)=Person(strip_names(z)...)
@btime fpers($persnt)
#  8.502 ns (1 allocation: 32 bytes)
#Person(34.0, "Jane")
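For readers on released Julia 1.x versions: as far as I know, `values(nt)` returns the contents of a named tuple as a plain tuple, so it can stand in for `strip_names`. The sketch below also shows a by-name inverse; `from_named_tuple` is a hypothetical helper, and it still assumes that a positional constructor `T(fields...)` exists. No timings are claimed here.

```julia
struct Person
    age::Float64
    name::String
end

nt = (age = 34.0, name = "Jane")

# values(nt) returns the fields as an ordinary Tuple, dropping the names.
plain = values(nt)
p1 = Person(plain...)

# Hypothetical by-name inverse: looks up each struct field in the named tuple,
# so the order of the named tuple's entries does not matter.
from_named_tuple(::Type{T}, nt::NamedTuple) where {T} =
    T((getfield(nt, n) for n in fieldnames(T))...)

p2 = from_named_tuple(Person, (name = "Jane", age = 34.0))
println((p1, p2))
```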

#10

I think your work deserves a PR. I understand @jameson's opinion that in theory everything that can be done with a NamedTuple could be done with the struct directly, but in practice such a large chunk of the data ecosystem is organized around NamedTuples that I do not believe it can be adjusted to work with structs in general, which IMHO makes your work quite useful.


#11

Nope, just tested https://github.com/JuliaLang/julia/pull/26025; it gives the same speed for strip_names, just with nicer code.

So kudos to andyferris; better wait until his PR is merged or copy-paste the code from his PR if you can’t wait.

edit: Once I fix my personal git-hell I will maybe submit a PR for the struct-to-named-tuple conversion, if you think it is useful.


#12

thank you.

There is also FastSplat, which speeds up splatting. I get ~6x (0.6.2) and ~3x (0.7.0-DEV).


#13

T(fields...) is only a constructor if the type doesn’t define a constructor. Similarly, to_named_tuple_generated would be wrong for any type that defines getproperty. It might make sense for this code to live in a package, with a “buyer-beware” warning, but I don’t think we should put it in base. This is also coupled to the reason Julia APIs often avoid using dot-oriented accessors (specifically, because it encourages excess coupling of a type’s data layout and usage & hinders effective usage of dispatch).
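Both caveats can be illustrated with a small sketch. The types `Temperature` and `Polar` and the helpers `fields_nt` and `props_nt` are invented for this example; they only show how a field-based conversion can disagree with a type's public interface.

```julia
# Hypothetical type with an inner constructor: no positional Temperature(x) method exists.
struct Temperature
    kelvin::Float64
    Temperature(; celsius) = new(celsius + 273.15)
end

# Hypothetical type whose public properties differ from its stored fields.
struct Polar
    r::Float64
    theta::Float64
end
Base.propertynames(::Polar) = (:x, :y)
function Base.getproperty(p::Polar, s::Symbol)
    s === :x && return getfield(p, :r) * cos(getfield(p, :theta))
    s === :y && return getfield(p, :r) * sin(getfield(p, :theta))
    return getfield(p, s)
end

# Field-based conversion (the approach used in this thread) versus a property-based one.
fields_nt(x) = NamedTuple{fieldnames(typeof(x))}(ntuple(i -> getfield(x, i), fieldcount(typeof(x))))
props_nt(x) = NamedTuple{propertynames(x)}(map(n -> getproperty(x, n), propertynames(x)))

pt = Polar(2.0, 0.0)
println(fields_nt(pt))   # names are :r and :theta
println(props_nt(pt))    # names are :x and :y

# Splatting the fields back into the type fails when only a custom constructor exists.
t = Temperature(celsius = 25.0)
splat_fails = try
    Temperature(values(fields_nt(t))...)
    false
catch err
    err isa MethodError
end
println(splat_fails)
```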


#14

Thanks!

So @piever, the best option is to copy-paste the struct_to_named_tuple code into your own code, or maybe submit it to some place where it fits (e.g. the packages that use named tuples in the data-frame/table ecosystem). You obviously have my permission to do so, and I don’t need credit for these 5 lines; it’s just like all the pointer-based non-allocating array views that everyone copy-pastes and modifies. If it doesn’t belong in base then it doesn’t belong in base, so I can procrastinate fixing my git for some more time, yay!

Splatting of named tuples will be reasonably fast once https://github.com/JuliaLang/julia/pull/26025 has landed, so the strip_names can be forgotten (modulo maybe unions: Tuples are covariant and named tuples are not, so I’m not sure whether there lurk performance dragons when splatting named tuples with Union{Int,Missing} fields).
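The variance remark can be checked directly with subtype queries. This is a minimal sketch; `row_type` is a hypothetical helper used only to shorten the type expressions, and it says nothing about the actual performance of splatting.

```julia
# Tuple types are covariant in their parameters.
tuple_is_subtype = Tuple{Int} <: Tuple{Union{Int, Missing}}

# NamedTuple is an ordinary parametric type, so its parameters are invariant.
row_type(T) = NamedTuple{(:a,), Tuple{T}}   # hypothetical helper for this sketch
namedtuple_is_subtype = row_type(Int) <: row_type(Union{Int, Missing})

# A bounded type parameter recovers a covariant-style match.
matches_bound = row_type(Int) <: NamedTuple{(:a,), <:Tuple{Union{Int, Missing}}}

println((tuple_is_subtype, namedtuple_is_subtype, matches_bound))   # (true, false, true)
```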


#15

I also think this probably belongs in some data package then (possibly one of the IterableTables group). I’ll make a PR when those packages update to Julia 0.7. Thanks again for your contribution.


#16

I think @jameson is probably right that we should just relax the requirement that things have to be a named tuple and accept any struct everywhere. Query.jl already has no special treatment for named tuples, and it might make sense to do the same for the iterable tables interface. I think I want to think a bit more about this, but that is my current view.
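As a rough sketch of what accepting any struct could look like, a consumer can read a row through `propertynames` and `getproperty` only, without requiring a NamedTuple. The function `row_pairs` is hypothetical and is not how Query.jl or the iterable tables interface is actually implemented.

```julia
struct Person
    age::Float64
    name::String
end

# Hypothetical consumer: reads a row only through propertynames/getproperty.
row_pairs(row) = [n => getproperty(row, n) for n in propertynames(row)]

from_struct = row_pairs(Person(34, "Jane"))
from_tuple = row_pairs((age = 34.0, name = "Jane"))
println(from_struct == from_tuple)
```

Going through the property interface rather than `getfield` also respects types that overload `getproperty`.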