Unroll setfield!

I need to unroll a loop calling setfield! to build a data structure. An MWE is below:

julia> @kwdef mutable struct Data32
           a::Int32 = 0
           b::Float32 = 0
           c::String = ""
           d::Int32 = 0
           e::Float32 = 0
       end
Data32

#
# I have tuples for the field names, values, and types which I want to modify
#
julia> field_names = (:a, :b, :c); field_values=(1, 1.0, "X"); field_types=(Int32, Float32, String)
(Int32, Float32, String)

#
# Manually setting the fields is very fast:
#
julia> function setup_data_manual!(data, field_names, field_values, field_types)
           setfield!(data, :a, convert(Int32, field_values[1]))
           setfield!(data, :b, convert(Float32, field_values[2]))
           setfield!(data, :c, convert(String, field_values[3]))
           return data
       end
setup_data_manual! (generic function with 1 method)

julia> @btime setup_data_manual!($(Data32()), $field_names, $field_values, $field_types)
  1.852 ns (0 allocations: 0 bytes)
Data32(1, 1.0f0, "X", 0, 0.0f0)

#
# But looping over the fields is type-unstable and slow:
#
julia> function setup_data!(data, field_names, field_values, field_types)
           for (i,f) in enumerate(field_names)
               setfield!(data, field_names[i], convert(field_types[i], field_values[i]))
           end
           return data
       end
setup_data! (generic function with 2 methods)

julia> @btime setup_data!($(Data32()), $field_names, $field_values, $field_types)
  245.451 ns (5 allocations: 128 bytes)
Data32(1, 1.0f0, "X", 0, 0.0f0)

Do I need a macro (or a generated function) for that? Would it be easy to write? (I honestly have a hard time understanding the docs and examples of macro writing - if I ever understand that I’ll write a manual for dummies).

It’s probably possible to make your code efficient with some tricks, but a cleaner approach could be to store field names/values together in a NamedTuple from the beginning.
Also, do you really need a mutable struct? Maybe you do, but maybe a regular struct would work even better.

Assuming names/values are stored in a namedtuple and you don’t need mutation:

using ConstructionBase

newfields = (a=1, b=1.0, c="X")
oldstruct = Data32()
newstruct = setproperties(oldstruct, newfields)
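
If mutation turns out to be needed after all, the NamedTuple idea can still be used, since a NamedTuple carries its field names in its type. The following is only a sketch in plain Base Julia (no ConstructionBase); `setfields!` is a hypothetical helper name, and the struct is repeated to keep the block self-contained:

```julia
Base.@kwdef mutable struct Data32
    a::Int32 = 0
    b::Float32 = 0
    c::String = ""
    d::Int32 = 0
    e::Float32 = 0
end

# Hypothetical helper: the names of a NamedTuple are part of its type, so a
# generated function can emit one setfield! call per name.
@generated function setfields!(data, nt::NamedTuple{names}) where {names}
    exprs = map(names) do n
        FT = fieldtype(data, n)  # inside the generator, `data` is the argument's type
        :(setfield!(data, $(QuoteNode(n)), convert($FT, getfield(nt, $(QuoteNode(n))))))
    end
    Expr(:block, exprs..., :data)  # return the mutated struct
end

d = setfields!(Data32(), (a = 1, b = 1.0, c = "X"))
```

Fields that are not mentioned in the NamedTuple (here `d` and `e`) are left untouched.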

So there’s a few things going on here. First off, your setup_data_manual! isn’t actually using field_names or field_types, and if it did, you’d find there’s also a performance problem with your manual version since julia is having problems constant propagating the field name symbols, and it’s also suffering from the awful problem that Tuples erase type info when given types:

julia> field_types=(Int32, Float32, String)
(Int32, Float32, String)

julia> typeof(field_types)
Tuple{DataType, DataType, DataType}

:face_vomiting:

I think the best way to handle this is with tuple recursion, and passing some info to the type domain:

unwrap(::Val{T}) where {T} = T


function setup_data_recur!(data, field_names, field_values, field_types) 
    name, names... = field_names
    value, values... = field_values
    type, types... = field_types
    setfield!(data, unwrap(name), convert(unwrap(type), value)) # Note the use of unwrap here
    setup_data_recur!(data, names, values, types)
end

# The base-case
setup_data_recur!(data, ::Tuple{}, ::Tuple{}, ::Tuple{}) = data

field_names_val = (Val(:a), Val(:b), Val(:c))
field_types_val = (Val(Int32), Val(Float32), Val(String))

This gives me

julia> @btime setup_data_recur!($(Data32()), $field_names_val, $field_values, $field_types_val)
  2.484 ns (0 allocations: 0 bytes)
Data32(1, 1.0f0, "X", 0, 0.0f0)
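
To see why the Val wrapping helps, it may be useful to compare the tuple types directly. A small sketch (`lift` is a hypothetical convenience helper, not something from Base):

```julia
unwrap(::Val{T}) where {T} = T

# A plain tuple of types only records that each element is a DataType...
plain = (Int32, Float32, String)
typeof(plain)

# ...while wrapping each element in Val moves the element into the tuple's type.
lifted = (Val(Int32), Val(Float32), Val(String))
typeof(lifted)

# Hypothetical helper: lift every element of a tuple into the type domain.
# The lifting itself happens at run time, so it is best done once, up front.
lift(t::Tuple) = map(Val, t)

unwrap(lift((:a, :b, :c))[2])
```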

Wow, that I would not imagine. I’ll try to adapt that to my actual case.

Maybe this is off topic, but why isn’t the solution simply a macro that expands the loop? Clearly I don’t get what macros actually do, because I would imagine that this would be a case where a four-line macro that just expanded the loop into a series of setfield! calls would be easy to write.

Macros operate on something that’s just one level past text. They don’t know anything about types, so they can’t know how long field_values actually is. Generated functions can do it though, here’s what that’d look like:

@generated function setup_data_gen!(data, field_names::NTuple{N, Any}, field_values::NTuple{N, Any}, field_types::NTuple{N, Any}) where {N}
    exprs = map(1:N) do i
        :(setfield!(data, unwrap(field_names[$i]), convert(unwrap(field_types[$i]), field_values[$i])))
    end
    Expr(:block, exprs..., :data) # return data, like the other versions
end

I like generated functions a lot because I find them easier to read and think about than recursion, but the compiler is typically much happier about recursion over Tuples than generated functions, so it’s good to use recursion when possible.
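
The point about macros only seeing syntax can be checked directly. In this small sketch, `@argkind` is a hypothetical macro that reports what it was handed at expansion time:

```julia
# Hypothetical macro: returns, as a string literal, the type of the syntax it received.
macro argkind(ex)
    return string(typeof(ex))
end

field_values = (1, 1.0, "X")

@argkind(field_values)     # the macro receives the name, not the tuple
@argkind((1, 1.0, "X"))    # a literal tuple arrives as an expression
```

In the first call the macro has no access to the tuple bound to `field_values`, so it cannot know how many setfield! calls to emit. A generated function, by contrast, sees the argument types, including the tuple length.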


Alternatively though, there is a macro I like to use that basically just unrolls a loop up to a certain number and then falls back to looping after that. This is usually quite a nice compromise, and I suspect it would work well for your case:

macro unroll(N::Int, loop)
    Base.isexpr(loop, :for) || error("only works on for loops")
    Base.isexpr(loop.args[1], :(=)) || error("This loop pattern isn't supported")
    val, itr = esc.(loop.args[1].args)
    body = esc(loop.args[2])
    # Unique label name for this expansion
    @gensym loopend
    label = :(@label $loopend)
    goto = :(@goto $loopend)
    out = Expr(:block, :(itr = $itr), :(next = iterate(itr)))
    # N copies of the loop body, each jumping to the end once the iterator is exhausted
    unrolled = map(1:N) do _
        quote
            isnothing(next) && $goto
            $val, state = next
            $body
            next = iterate(itr, state)
        end
    end
    append!(out.args, unrolled)
    # Ordinary loop for whatever is left after N iterations
    remainder = quote
        while !isnothing(next)
            $val, state = next
            $body
            next = iterate(itr, state)
        end
        $label
    end
    push!(out.args, remainder)
    out
end

# Variant that skips the isnothing checks when the iterator has at least N elements
macro unroll2(N::Int, loop)
    Base.isexpr(loop, :for) || error("only works on for loops")
    Base.isexpr(loop.args[1], :(=)) || error("This loop pattern isn't supported")
    val, itr = esc.(loop.args[1].args)
    body = esc(loop.args[2])
    out = Expr(:block, :(itr = $itr), :(next = iterate(itr)))
    unrolled = map(1:N) do _
        quote
            $val, state = next
            $body
            next = iterate(itr, state)
        end
    end
    append!(out.args, unrolled)
    remainder = quote
        while !isnothing(next)
            $val, state = next
            $body
            next = iterate(itr, state)
        end
    end
    push!(out.args, remainder)
    # Fall back to the checked @unroll when the iterator is shorter than N
    fallback = esc(:(@unroll $N $loop))
    quote
        if length($itr) >= $N
            $out
        else
            $fallback
        end
    end
end

With that, you can just write

function setup_data_unroll!(data, field_names, field_values, field_types)
    @unroll 64 for (i,f) in enumerate(field_names)
        setfield!(data, unwrap(field_names[i]), convert(unwrap(field_types[i]), field_values[i]))
    end
    return data
end

and this will be fast and efficient for lists of names up to 64 entries long, and then fall back to a loop after that.


Is that a problem of tuples?

julia> typeof(Int32)
DataType

julia> typeof(Float32)
DataType

julia> typeof(String)
DataType

It is specifically a problem in Tuple because you have no control over reparameterizing them. Compare the following:

julia> struct Foo{T}
           x::T
       end

julia> Foo(Int)
Foo{DataType}(Int64)

julia> Foo{Type{Int}}(Int)
Foo{Type{Int64}}(Int64)

against

julia> Tuple{Type{Int}}((Int,))
(Int64,)

julia> typeof(ans)
Tuple{DataType}

There is no way to create an instance of the type Tuple{Type{Int}}, unlike regular types.

Isn’t that the difference between Base.typeof and Core.Typeof?

julia> Base.typeof.(field_types)
(DataType, DataType, DataType)

julia> Core.Typeof.(field_types)
(Type{Int32}, Type{Float32}, Type{String})

Yes? I don’t really see why it’s relevant though. Tuple forces a less specific parameter than it could, and this causes performance problems when dealing with Tuples containing types.

There’s no reason we couldn’t make it so that users can at least opt into creating a Tuple{Type{Int}} even if that’s not the default, but this is simply not possible, requiring the above Val trickery.

My point is that typeof gives you DataType on an individual type, I don’t see why you’d expect the tuple to be any different.

I didn’t call typeof anywhere in the code I wrote. (and besides, as I’ve said, I just at least want the option to convert. )

Ah, sorry, I must have misread

then

Ah, yeah, okay, that was to explain why this behaviour was problematic. I thought you meant about the actual problem at hand.

Yes, typeof gives that for a Tuple of types. The problem is that once that’s done, the type info is erased.

julia> Core.Typeof((Int32, Float32, String))
Tuple{DataType, DataType, DataType}

julia> Core.Typeof.((Int32, Float32, String))
(Type{Int32}, Type{Float32}, Type{String})

Or put another way

julia> f(::Tuple{Type{T}}) where {T} = T
f (generic function with 1 method)

julia> f((Int,))
ERROR: UndefVarError: `T` not defined in static parameter matching

This is very unlike if it was a bare type:

julia> f(::Type{T}) where {T} = T
f (generic function with 3 methods)

julia> f(Int)
Int64

because the Tuple constructor is erasing valuable information.
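
For comparison, here is a sketch of the same dispatch pattern with the element wrapped in Val, which does keep the parameter available (`g` is just an illustrative name):

```julia
# The static parameter T can be recovered because Val{T} stores it in the type.
g(::Tuple{Val{T}}) where {T} = T

g((Val(Int),))
```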


The recursive function worked beautifully, thanks!


Sorry, lots of edits here, but here’s the way I would do this, exploiting that ntuple(::Val) is a generated function that writes the unrolled loop for you:

@inline function setup_data!(data::T, names::NTuple{N,Symbol}, values::NTuple{N,Any}) where {T,N}
    ntuple(Val(N)) do i
        @inline
        name, value = names[i], values[i]
        FT = fieldtype(typeof(data), name)
        setfield!(data, name, convert(FT, value))
        nothing
    end
    return data
end

Note that you don’t need to bring the types yourself, let fieldtype take care of that. As for the tuple of nothings that’s created by ntuple but never used for anything, trust the compiler to not instantiate that.


EDIT: Scratch everything below, I didn’t realize that you were only setting a subset of the fields

Here’s a solution I’ve been using to avoid writing my own generated functions or complicated recursions:

@inline function setup_data!(data::T, values::Tuple) where {T}
    @assert length(values) >= fieldcount(T)
    ntuple(Val(fieldcount(T))) do i
        @inline
        FT = fieldtype(typeof(data), i)
        setfield!(data, i, convert(FT, @inbounds values[i]))
        nothing
    end
    return data
end

This exploits that ntuple(::Val) is a generated function that writes an unrolled loop for you. I trust the compiler to skip creating the tuple of nothings that’s never used for anything.


I tried something like that and it didn’t work (now I don’t know what was it). Is the @inline relevant there?

I think it’s just the same problem with names being a Tuple{Symbol}. If you lift them to Vals it’ll probably work fine for something like this, but in my experience ntuple often behaves more poorly than tuple-recursion or generated functions if the length is long or the Tuple body is complicated. For that reason I’d be a bit hesitant to recommend it, but yes, it is an option.
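
A sketch of what lifting the names to Vals might look like for the ntuple approach. `setup_data_ntuple!` is a hypothetical name, the struct is repeated for self-containment, and whether this matches the recursive version in speed is something to benchmark rather than assume:

```julia
Base.@kwdef mutable struct Data32
    a::Int32 = 0
    b::Float32 = 0
    c::String = ""
    d::Int32 = 0
    e::Float32 = 0
end

unwrap(::Val{T}) where {T} = T

# Hypothetical variant: names arrive as a tuple of Val instances.
function setup_data_ntuple!(data, names::NTuple{N,Any}, values::NTuple{N,Any}) where {N}
    ntuple(Val(N)) do i
        name = unwrap(names[i])             # field name recovered from Val{name}
        FT = fieldtype(typeof(data), name)  # field type looked up from the struct
        setfield!(data, name, convert(FT, values[i]))
        nothing                             # keep the throwaway tuple trivial
    end
    return data
end

d = setup_data_ntuple!(Data32(), (Val(:a), Val(:b), Val(:c)), (1, 1.0, "X"))
```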


Hm, I guess since names is an argument to the function the field names won’t generally be constpropped through setfield! and fieldtype, even if the loop is unrolled. That’s why @Mason is suggesting wrapping them in Val I suppose. I didn’t think too carefully about that, in my own applications I’m always looping over every field. However, if setup_data! is inlined into the caller where names is created, constprop should still work and make this version efficient. Hence, you might want to put an @inline annotation on setup_data!. Post updated accordingly.

The @inline in the do block can be important since we’re nominally capturing a type T in a closure, where it gets typed as DataType, i.e., T isn’t known at compile time and hence fieldtype(T, name) can’t be resolved at compile time even if name is known and constpropped. This is fixed by JuliaLang/julia pull request #40985, "Fix type instability of closures capturing types (2)" by simeonschaub, but that PR was only merged a month ago. However, if the closure is inlined this isn’t an issue, since T is known at compile time in the outer function.

That said, I now realize you can replace T with typeof(data) in the closure to avoid this issue even if the closure isn’t inlined. Post updated accordingly.

Note that only ntuple(f, ::Val{N}) is generated, but as long as you use that method, I don’t see why it should behave worse than a generated function you write yourself. Would it be a problem that the iterations are wrapped in a :tuple expression that the compiler needs to realize it can throw away? I always end the do block with nothing to at least ensure that the tuple type is trivial to infer.

In comparison, ntuple(f, N::Int) branches at runtime to pick between a hardcoded unrolled loop for N <= 10 and a normal loop for N > 10. On the one hand, this cutoff might be preferable over generating arbitrarily long unrolled loops; on the other hand, you’re dependent on constprop to avoid the branch and type instability.
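
A small sketch of the two call styles side by side; the function names are illustrative only. Both produce the same tuple, and `Base.return_types` can be used to inspect what inference concludes in each case:

```julia
# Length known only at run time: the result type cannot encode the length.
squares_dynamic(n::Int) = ntuple(i -> i^2, n)

# Length in the type domain: the tuple type is fully determined.
squares_static() = ntuple(i -> i^2, Val(3))

squares_dynamic(3) == squares_static()

# Inspect the inferred return types of both versions.
Base.return_types(squares_dynamic, (Int,))
Base.return_types(squares_static, ())
```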
