Type-stable Array of NamedTuples

robsmith11 · December 27, 2019, 5:23pm

I often want to do the following for a large n where I perform some calculations and return a tuple of values with the result of each iteration:

function f(n::Int)
  v = Vector()
  for i in 1:n 
    ...
    push!(v, (a=i, b=1.23, c=false, d=:foo, ...))
  end
  v
end

As written, the code is less efficient than it could be because v will have type Array{Any,1}. The type instability can be fixed by explicitly specifying the type of the NamedTuple, but that is cumbersome as the tuple’s size grows.

Is there any way to tell Julia that I only want to push a single type to the Array so that it can either

automatically infer the type from the push!, or
wait to initialize the Array until the first push! call so that the type will be known at that time?

bashonubuntu · December 27, 2019, 5:26pm

Check out

https://github.com/JuliaComputing/IndexedTables.jl

IndexedTables provides the backend for JuliaDB, which you can also check out:

https://github.com/JuliaComputing/JuliaDB.jl

There are some type instabilities you can’t fully resolve. That’s fine: just put that piece of code behind a function barrier. See:

https://docs.julialang.org/en/v1/manual/performance-tips/#kernel-functions-1
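
As a rough sketch of the function-barrier idea, the example below keeps the untyped Vector from the original question but hands it to a separate kernel function after narrowing its element type. The names build_rows, sum_a and total_a are hypothetical and exist only for illustration.

```julia
# Hypothetical builder: returns a Vector{Any}, as in the original question.
function build_rows(n::Int)
    v = Vector()
    for i in 1:n
        push!(v, (a=i, b=1.23, c=false, d=:foo))
    end
    return v
end

# Hypothetical kernel: Julia compiles a specialized method for the
# concrete vector type it receives, so this loop is type-stable.
function sum_a(rows)
    s = 0
    for r in rows
        s += r.a
    end
    return s
end

function total_a(n::Int)
    rows = build_rows(n)       # element type is Any here
    typed = identity.(rows)    # broadcasting narrows the element type
    return sum_a(typed)        # one dynamic dispatch, then a fast loop
end
```

The type instability is still present in total_a, but its cost is paid once at the call to sum_a rather than on every loop iteration.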

ericphanson · December 27, 2019, 5:36pm

Maybe push!! from BangBang.jl could help? Not sure though.
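
The general idea of a push that may return a new, wider container can be sketched in plain Base Julia. This is not BangBang's implementation; push_widen and collect_rows are hypothetical helpers written only to illustrate the pattern of reassigning the result of each push.

```julia
# Hypothetical helper: push `x` into `v` if it fits the element type,
# otherwise copy `v` into a vector with a wider element type first.
function push_widen(v::Vector{T}, x) where {T}
    x isa T && return push!(v, x)
    w = Vector{typejoin(T, typeof(x))}(undef, length(v))
    copyto!(w, v)
    return push!(w, x)
end

function collect_rows(n::Int)
    v = Union{}[]   # empty vector whose element type accepts nothing yet
    for i in 1:n
        # Always rebind `v`: the helper may return a different vector.
        v = push_widen(v, (a=i, b=1.23, c=false, d=:foo))
    end
    return v
end
```

After the first push the vector has the concrete NamedTuple element type, so later pushes go straight to push!. Note that the variable v itself changes type once inside the loop, so this sketch gives a concretely typed result rather than a fully type-stable function body.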

sairus7 · December 27, 2019, 5:39pm

You can use typeof to get the type of a value and construct a vector of that type:

function foo(n::Int)
  v = nothing
  for i in 1:n
    # Allocate on the first iteration, once the element type is known
    if v === nothing
      v = Vector{typeof((a=i, b=1.23, c=false, d=:foo))}(undef, n)
    end
    # push!(v, (a=i, b=1.23, c=false, d=:foo))
    v[i] = (a=i, b=1.23, c=false, d=:foo)
  end
  v
end

But it is better if you know the element type at initialization.
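
One possible way to keep the typeof approach manageable for wide tuples is to build the tuple in a single place and take the type of the first result. The sketch below assumes the per-iteration work can be moved into a helper; make_row and foo_first are hypothetical names, and requiring n >= 1 is a simplification.

```julia
# Hypothetical per-iteration computation; the tuple is written only once.
make_row(i) = (a=i, b=1.23, c=false, d=:foo)

function foo_first(n::Int)
    n >= 1 || throw(ArgumentError("n must be at least 1"))
    first_row = make_row(1)
    # The element type comes from the first result, however wide it is.
    v = Vector{typeof(first_row)}(undef, n)
    v[1] = first_row
    for i in 2:n
        v[i] = make_row(i)
    end
    return v
end
```

Because the tuple literal appears once, adding a field means changing only make_row.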

You can use StructArrays.jl if you need to work on vectors of individual fields.

There is also map syntax:

map(v -> f(v), 1:n)

and list comprehension:
[f(v) for v in 1:n]
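
As a small illustration of both forms, assuming a hypothetical per-element function named record, the two calls below should produce the same vector with a concrete NamedTuple element type:

```julia
# Hypothetical per-element function returning one NamedTuple.
record(i) = (a=i, b=1.23, c=false, d=:foo)

via_map = map(record, 1:3)               # map form
via_comp = [record(i) for i in 1:3]      # comprehension form

# Both containers take their element type from the values produced.
same_type = eltype(via_map) == eltype(via_comp)
```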

mohamed82008 · December 27, 2019, 5:41pm

See this discussion: “Ridiculous idea: types from the future”.

bashonubuntu · December 27, 2019, 5:57pm

Does this scale if the tuple size grows?

sairus7 · December 27, 2019, 6:04pm

Do you mean that:
a) the tuple can have a different size between iterations within one function call, or
b) the tuple is large for some particular functions, but its size remains constant within one call?

bashonubuntu · December 27, 2019, 6:07pm

I mean b), if I understood correctly. Do you think that if I had a large number of elements in the tuple, say 100, initializing a vector with typeof() will be cumbersome? Maybe there is an automatic way to do this too.

sairus7 · December 27, 2019, 6:22pm

Consider splitting one big tuple into a group of smaller tuples or structures, like

t1 = (a, b, c, d, e)
s1 = MyStruct(f,g,h)
t = (t1, s1)

Or split your data and processing into different vectors and functions.
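
A minimal sketch of the grouping idea, using nested NamedTuples only (a struct could take the place of either group). The field grouping and the name entry are assumptions made for illustration.

```julia
# Hypothetical grouping: two small named tuples instead of one wide one.
entry(i) = (nums=(a=i, b=1.23), flags=(c=false, d=:foo))

entries = map(entry, 1:3)

# Fields are reached through their group.
second_a = entries[2].nums.a
second_d = entries[2].flags.d
```

The element type of entries is still a single concrete NamedTuple type, so the grouping changes how fields are organized rather than the type stability of the container.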

Also, tuples are not very efficient for a very large number of elements.

If you need to fill one big table, there are packages such as DataFrames or JuliaDB.

bashonubuntu · December 27, 2019, 6:23pm

Perfect, that’s exactly why I linked to IndexedTables.jl above

rdeits · December 27, 2019, 7:44pm

As discussed over in Ridiculous idea: types from the future (as @mohamed82008 mentioned), there is a pretty nice way to do this with map:

function f(n)
  map(1:n) do i
    (a=i, b=1.23, c=false, d=:foo)
  end
end

This works well, since it’s shorter than the original code while being type-stable:

julia> f(5)
5-element Array{NamedTuple{(:a, :b, :c, :d),Tuple{Int64,Float64,Bool,Symbol}},1}:
 (a = 1, b = 1.23, c = 0, d = :foo)
 (a = 2, b = 1.23, c = 0, d = :foo)
 (a = 3, b = 1.23, c = 0, d = :foo)
 (a = 4, b = 1.23, c = 0, d = :foo)
 (a = 5, b = 1.23, c = 0, d = :foo)

julia> @code_warntype f(5)
Variables
  #self#::Core.Compiler.Const(f, false)
  n::Int64
  #3::getfield(Main, Symbol("##3#4"))

Body::Array{NamedTuple{(:a, :b, :c, :d),Tuple{Int64,Float64,Bool,Symbol}},1}

robsmith11 · December 27, 2019, 8:04pm

map is indeed a good solution. I hadn’t realized that map would allow updating of state as it iterates over a vector, but it does work fine.
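
A sketch of what updating state inside map can look like, assuming a hypothetical running total. The state lives in a Ref so that the closure mutates a container instead of reassigning a captured variable, which is a common way to avoid boxing. It also relies on map visiting the range in order.

```julia
# Hypothetical example: each result carries a running total of `i`.
function running_totals(n::Int)
    total = Ref(0)                 # mutable state shared across iterations
    map(1:n) do i
        total[] += i
        (a=i, total=total[])
    end
end
```

Calling running_totals(4) gives four NamedTuples whose total fields are 1, 3, 6 and 10.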

BTW, I’d mark this topic as solved, but for some reason the button no longer shows up for me.

EDIT:
However, map restricts one to returning a single result for each iteration. It’s not possible to push multiple or zero results per iteration (without using nested arrays).

Mason · December 27, 2019, 9:21pm

You can return a tuple of multiple (or zero) results and then call Iterators.flatten on the result.
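
A minimal sketch of the flatten approach. It returns a small vector per iteration instead of a tuple, so that every iteration returns the same type; that substitution, the rule for how many rows each i produces, and the names item and variable_rows are all assumptions made for illustration.

```julia
# Hypothetical row constructor.
item(i, k) = (a=i, k=k, b=1.23)

function variable_rows(n::Int)
    # Each iteration yields zero, one or two rows (i % 3 of them).
    chunks = map(1:n) do i
        [item(i, k) for k in 1:(i % 3)]
    end
    # Flatten the per-iteration chunks into a single vector.
    return collect(Iterators.flatten(chunks))
end
```

For n = 4 this produces one row for i = 1, two for i = 2, none for i = 3 and one for i = 4. The element type of the empty chunk is chosen by inference, so it is worth checking the result's eltype in real code.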
