Reshape into named tuple

I have a vector whose elements are themselves vectors. Each inner vector is homogeneous, but the element types may differ from one to the next, and all of them have the same length (>1e8).

Example:

julia> data
3-element Vector{Any}:
 [0.7947257509914385, 0.938439282086563, 0.7825664679861779, 0.9622100725905216, 0.4505364179186473]
 [0.05038215152695846, 0.23018482446598987, 0.3575198683096761, 0.5687945898953082, 0.45331859880110126]
 [0.584780363134745, 0.7596965009119068, 0.011245396510286998, 0.9853654421407447, 0.6357566746779792]

and corresponding field names “:x”, “:y”, “:z”.

How can I transform this into an array of named tuples (and thus type-stable),
such that sdata[2].x == 0.938439282086563?

A single element can be accessed with data[1][2], but any combination involving : always yields a 5-element vector.

I found this similar thread, but I am not sure whether it applies.

Probably not very elegant but it works:

julia> data = [rand(10), rand(10), rand(10)]
3-element Vector{Vector{Float64}}:
 [0.3045813142892593, 0.4122350061672848, 0.4735767061019649, 0.2644570924010421, 0.048985221246464095, 0.5999148629442601, 0.9338244833695397, 0.7867455249568172, 0.28319601162576524, 0.13429265527502454]
 [0.5435428640235926, 0.876153723408583, 0.1650348423847161, 0.7847454441156623, 0.5331547615539582, 0.12257495022585352, 0.3824331934854508, 0.4964234599222994, 0.8902713057487648, 0.6755358224731103]
 [0.19611420814764946, 0.399766030972579, 0.5676998793707868, 0.7451572134846107, 0.007488060584806666, 0.34447529370151586, 0.3635575672925839, 0.27969842837880576, 0.3497395687887557, 0.23772803836323408]

julia> nt = collect((x=data[1][i], y=data[2][i], z=data[3][i]) for i in 1:length(data[1]))
10-element Vector{NamedTuple{(:x, :y, :z), Tuple{Float64, Float64, Float64}}}:
 (x = 0.3045813142892593, y = 0.5435428640235926, z = 0.19611420814764946)
 (x = 0.4122350061672848, y = 0.876153723408583, z = 0.399766030972579)
 (x = 0.4735767061019649, y = 0.1650348423847161, z = 0.5676998793707868)
 (x = 0.2644570924010421, y = 0.7847454441156623, z = 0.7451572134846107)
 (x = 0.048985221246464095, y = 0.5331547615539582, z = 0.007488060584806666)
 (x = 0.5999148629442601, y = 0.12257495022585352, z = 0.34447529370151586)
 (x = 0.9338244833695397, y = 0.3824331934854508, z = 0.3635575672925839)
 (x = 0.7867455249568172, y = 0.4964234599222994, z = 0.27969842837880576)
 (x = 0.28319601162576524, y = 0.8902713057487648, z = 0.3497395687887557)
 (x = 0.13429265527502454, y = 0.6755358224731103, z = 0.23772803836323408)
julia> data = Any[rand(5), rand(Int8, 5), rand(Bool, 5)]
3-element Vector{Any}:
 [0.3005737233263839, 0.9001020036533961, 0.23278850839511744, 0.28874192844445024, 0.5812484239516024]
 Int8[78, 45, 82, -28, 106]
 Bool[0, 1, 1, 0, 1]

julia> nt = (; zip([:x; :y; :z], data)...)
(x = [0.3005737233263839, 0.9001020036533961, 0.23278850839511744, 0.28874192844445024, 0.5812484239516024], y = Int8[78, 45, 82, -28, 106], z = Bool[0, 1, 1, 0, 1])

julia> typeof(nt)
NamedTuple{(:x, :y, :z), Tuple{Vector{Float64}, Vector{Int8}, Vector{Bool}}}

julia> nt.y[3]
82

EDIT: Actually I think you wanted this:

julia> NamedTuple.(zip.(Ref([:x; :y; :z]), zip(data...)))
5-element Vector{NamedTuple{(:x, :y, :z), Tuple{Float64, Int8, Bool}}}:
 (x = 0.3005737233263839, y = 78, z = 0)
 (x = 0.9001020036533961, y = 45, z = 1)
 (x = 0.23278850839511744, y = 82, z = 1)
 (x = 0.28874192844445024, y = -28, z = 0)
 (x = 0.5812484239516024, y = 106, z = 1)
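
The broadcast one-liner is fairly dense, so here is a rough step-by-step sketch of the same idea. The small `data` vector and the `names` tuple are stand-ins chosen for illustration, and building each row with `NamedTuple{names}(row)` is a swapped-in alternative to `NamedTuple(zip(...))`, not what the line above does internally.

```julia
# Small stand-in data with the same shape as in the post (assumption)
data = Any[[0.1, 0.2], Int8[1, 2], Bool[0, 1]]
names = (:x, :y, :z)

# zip(data...) walks all columns in lockstep and yields one tuple per row:
# (0.1, 1, false), then (0.2, 2, true)
rows = zip(data...)

# Attach the field names to each row tuple
sdata = [NamedTuple{names}(row) for row in rows]

sdata[2].x  # 0.2
```

Splatting a `Vector{Any}` cannot be inferred by the compiler, but the resulting vector still ends up with a concrete element type, because the rows themselves are concretely typed.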

Thanks for the start!

It would still need to be extended to handle field names whose number and values are read from IO.
Maybe there is no really elegant solution; it's part of the “array of structs” deficiency: I can store, transmit, and load homogeneous streams, but in the end I have to re-sort them. That is not advisable for a large number of structs.
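
One possible way to handle field names that are only known at runtime is a function barrier, sketched below. The helper names `rows_from_columns` and `_rows` are hypothetical, and the header string stands in for whatever is actually read from IO.

```julia
# Hypothetical helper: `names` and `columns` are only known at runtime.
function rows_from_columns(names::Vector{Symbol}, columns::AbstractVector)
    # Column-wise named tuple; its type is concrete even though the
    # inputs were not inferable.
    cols = NamedTuple{Tuple(names)}(Tuple(columns))
    # Function barrier: `_rows` is compiled for the concrete type of `cols`.
    return _rows(cols)
end

# Builds one named tuple per row (this copies the data once).
_rows(cols::NamedTuple{N}) where {N} =
    [NamedTuple{N}(vals) for vals in zip(values(cols)...)]

# Usage with a made-up header line standing in for IO input
names = Symbol.(split("x y z"))
data = Any[[0.1, 0.2], Int8[1, 2], Bool[0, 1]]
sdata = rows_from_columns(names, data)
sdata[2].x  # 0.2
```

The lookup of names and types happens once in the outer function; the loop over the rows then runs inside `_rows` with fully known types.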

Could we think of data structures other than named tuples here?
Ones that allow incremental build-up and do not require copying, data on the heap, etc.
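
As a rough sketch of one such direction: a thin wrapper can keep the columns as they are and only assemble a named tuple when a row is indexed, so nothing is copied. The `RowView` type below is hypothetical and deliberately minimal; the StructArrays.jl package provides a far more complete version of this idea.

```julia
# Hypothetical lazy row view over a NamedTuple of equally long columns.
struct RowView{T, C<:NamedTuple} <: AbstractVector{T}
    cols::C
end

function RowView(cols::NamedTuple{N}) where {N}
    # Row type: same field names, element types taken from the columns
    T = NamedTuple{N, Tuple{map(eltype, values(cols))...}}
    return RowView{T, typeof(cols)}(cols)
end

Base.size(v::RowView) = size(v.cols[1])
Base.IndexStyle(::Type{<:RowView}) = IndexLinear()

# A row is assembled on demand; the columns themselves are never copied.
Base.getindex(v::RowView, i::Int) = map(c -> c[i], v.cols)

# Incremental build-up: append one value to each column, matched by name.
function Base.push!(v::RowView, row::NamedTuple)
    for name in keys(v.cols)
        push!(getfield(v.cols, name), getfield(row, name))
    end
    return v
end

# Usage with small stand-in columns
cols = (x = [0.1, 0.2], y = Int8[1, 2], z = Bool[0, 1])
rows = RowView(cols)
rows[2].x  # 0.2
push!(rows, (x = 0.3, y = Int8(3), z = true))
```

Because the wrapper shares storage with `cols`, the pushed values also show up in the original column vectors.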

Exactly, thanks a lot!