Efficient reflection on structs

JeffreySarnoff · February 15, 2021, 4:30pm

Use your approach (immediately above). @pure is not applicable here.

JeffreySarnoff · February 15, 2021, 4:36pm

fieldtypes cannot be @pure as it is a generic function and that is enough to disallow the use of @pure. As fieldtype selects from one of the fieldtypes, fieldtype cannot be @pure.

Henrique_Becker · February 15, 2021, 5:33pm

So… I am not sure I understand correctly: are you saying that @Guillaume_Leclerc should or should not be using @pure? Your reasoning seems to indicate that myInit could not be declared @pure either, because it calls fieldtype (which in turn calls fieldtypes). It also calls zero, which is a generic function (and as such goes against the rules for using @pure). So why did you tell @Guillaume_Leclerc to “use it”?

JeffreySarnoff · February 15, 2021, 6:10pm

Yes – he should not be using @pure … the “use it” was referring to his approach in the comment immediately before mine. I understand the confusion (thank you) and I am editing that response.

marius311 · February 15, 2021, 7:08pm

Like this?

julia> struct Test
           a::Int
           b::Float32
       end

julia> function init(T, s::Symbol)
           tpe = fieldtype(T, s)
           zero(tpe)
       end
init (generic function with 1 method)

julia> @code_llvm (() -> init(Test, :a))()
;  @ REPL[3]:1 within `#9'
define i64 @"julia_#9_1748"() {
top:
  ret i64 0
}

julia> @code_llvm (() -> init(Test, :b))()
;  @ REPL[4]:1 within `#11'
define float @"julia_#11_1750"() {
top:
  ret float 0.000000e+00
}

Guillaume_Leclerc · February 15, 2021, 7:32pm

You are right!
I understand what happened now.

I was using

@code_llvm (() -> my_structure.field)()  # I have my implementation of getfield that involves init

and it was generating suboptimal code without @pure.

@code_llvm (a -> a.field)(my_structure)

Is optimal even without @pure. When I think about it, it makes sense now because the function was capturing a global variable.
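For readers following along, here is a minimal sketch of that difference. The `Point` struct and the global `p` are hypothetical and not taken from the code discussed in this thread. The sketch compares what the compiler can infer when a closure reads a non-const global with what it can infer when the same value is passed as an argument.

```julia
# Hypothetical example type; not part of the wrapper discussed in this thread.
struct Point
    x::Int
end

p = Point(1)              # non-const global: its type may change at any time

from_global = () -> p.x   # reads the global, so the result type is not known
from_arg = a -> a.x       # the argument type is known when the call is compiled

# Return types as inferred by the compiler
global_ret = Base.return_types(from_global, ())[1]
arg_ret = Base.return_types(from_arg, (Point,))[1]

println("global:   ", global_ret)
println("argument: ", arg_ret)
```

Declaring the global `const`, or passing the value as an argument as in the second closure, should let inference see a concrete type.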

Now my wrapper works and generates perfect assembly.

Thanks everyone!! I really enjoy the fact that it’s possible to go really deep on the critical parts of the code base while having all the benefits of an interpreted language for the parts that don’t matter

lmiq · February 15, 2021, 7:43pm

A little bit off-topic, but you seem like someone who would enjoy reading these comments, since you seem interested in a deeper understanding and use of Julia:

Guillaume_Leclerc · February 15, 2021, 10:42pm

When I thought I was out of the woods I realized I still have an issue with the copy.

To copy my struct I need to iterate over all the fields

I have the following struct

mutable struct Data
    var_1::UInt8
end

I implemented getproperty and setproperty! on the Wrapper type and it performs as expected:

@btime ((u, v) -> u.var_1 = v.var_1)(store[1, 1], store[1, 2])
  53.270 ns (2 allocations: 128 bytes)

To avoid re-implementing stuff I wanted to write my copy operation this way

function myCopy(dest, source) # Type parameters omitted for clarity
    ks = I.typesInfo # I is a type parameter with a bunch of info
    for field in keys(ks)
        setproperty!(dest, field, getproperty(source, field))
    end
end

And so far results are good

@btime ((u, v) -> myCopy(u, v))(store[1, 1], store[1, 2])
  57.141 ns (2 allocations: 128 bytes)

But if I add another field in the struct:

mutable struct Data2
    var_1::UInt8
    var_2::UInt8
end

absolute disaster:

@btime ((u, v) -> myCopy(u, v))(store[1, 1], store[1, 2])
  1.971 μs (36 allocations: 1.13 KiB)

It’s 20x slower than I would expect. My guess is that the compiler doesn’t unroll the loop, but I’m not sure because I can’t make sense of the generated LLVM code.

Any idea?

(If it can’t be solved I can always reimplement the copy with the information on the underlying storage but I thought it was more elegant/maintainable to express it this way)

Mason · February 15, 2021, 10:50pm

Again, it’s hard to help because the code you’re providing isn’t runnable because you’re omitting parts of the code ‘for clarity’. Could you show a complete example?

e.g.

What are the type parameters you’ve omitted?
What is a here?
What is store?

The nuclear option here of course is to write a generated function to get the code generation you desire, but I think it should be doable with tuple recursion. With a more complete example I can try to help.

Edit: I’m also confused because the code you show that’s slower isn’t even using the myCopy function, but instead just does a single setproperty.

@btime ((u, v) -> u.var_1 = v.var_1)(store[1, 1], store[1, 2])
  1.971 μs (36 allocations: 1.13 KiB)

Guillaume_Leclerc · February 15, 2021, 11:09pm

Sorry for the confusion; I fixed the issues.

store and a were the same thing and represent a structure that is essentially a StructArray. When indexed, it returns a view (my wrapper) that behaves similarly to the struct.
The setproperty! (correct and fast) works on two of these views by modifying the underlying storage.

I just wanted to be able to copy a view onto another by copying all fields one by one.

I will come up with a MWE tonight but I like the idea of the tuple recursion so I will try first and only bother you if it’s not working either.

Mason · February 15, 2021, 11:22pm

Okay, I’ll await the MWE and try to assist.

But in the meantime, my understanding is that you have two structs that have the same field but different layout and you want to efficiently copy data between them to the corresponding fields.

Here’s how I’d do that with a @generated function as one way of getting around the limitations of fieldnames:

mutable struct Data1
    f1::Int
    f2::Int
    f3::Int
end

mutable struct Data2
    f3::Int
    f1::Int
    f2::Int
end

@generated function mycopy!(dest::Union{Data1, Data2}, source::Union{Data1, Data2})
    assigns = map(fieldnames(source)) do field
        :(setproperty!(dest, $(QuoteNode(field)), getproperty(source, $(QuoteNode(field)))))
    end
    ex = Expr(:block, assigns..., :dest)
end

using BenchmarkTools  # provides @btime

let d = Data2(0, 0, 0), s = Data1(1, 2, 3)
    @btime mycopy!($d, $s)
end

#+RESULTS:
   2.100 ns (0 allocations: 0 bytes)
 Data2(3, 1, 2)
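To see what the generated function body amounts to, the same expression can be built outside of any function and printed. This is only an illustrative sketch: the tuple `fields` is written out by hand here, on the assumption that it matches what `fieldnames` returns for `Data1` above.

```julia
# Field names written out by hand; assumed to match fieldnames(Data1).
fields = (:f1, :f2, :f3)

# Same expression construction as in the body of the generated function.
assigns = map(fields) do field
    :(setproperty!(dest, $(QuoteNode(field)), getproperty(source, $(QuoteNode(field)))))
end
ex = Expr(:block, assigns..., :dest)

# Prints one setproperty! call per field, followed by `dest`.
println(ex)
```

Each field name appears as a literal symbol in the printed block, so the compiler sees a fixed sequence of assignments rather than a loop over run-time symbols.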

Guillaume_Leclerc · February 15, 2021, 11:28pm

No they are both the same type which is a view on a large array that behaves the same way as the struct. (same as StructArrays.jl)

Thanks! I haven’t taken a look at macros in Julia yet but it seems your code would almost work as it is in my case.

Guillaume_Leclerc · February 16, 2021, 4:23am

Sorry for the delay.

I tried the tuple recursion (I might have done it wrong, though) and it was slower.

Here is the MWE


module Storage

import Base

struct Wrapper{T, F}
    data::T
end

function wrap(data::T) where T
    fnames = fieldnames(T)
    sizes = map(x -> sizeof(fieldtype(T, x)), fnames)
    F = NamedTuple{fnames}(sizes)
    Wrapper{T, F}(data)
end

function easyCreate(T)
    v = T(map(x -> zero(fieldtype(T, x)), fieldnames(T))...)
    wrap(v)
end

function myCopy(target::Wrapper{T, F}, source::Wrapper{T, F}) where {T, F}
    for field in keys(F)
        setproperty!(target, field, getproperty(source, field))
    end
end

function Base.getproperty(wrapper::Wrapper{T, F}, s::Symbol) where {T, F}
    zero(fieldtype(T, s))
end

function Base.setproperty!(wrapper::Wrapper{T, F}, s::Symbol, v) where {T, F}
    setproperty!(getfield(wrapper, :data), s, getproperty(wrapper, s))
end

end

import .Storage
using BenchmarkTools

mutable struct Data
    a::Int
end

mutable struct Data2
    a::Int
    b::Float32
end

create = Storage.easyCreate

data1 = create(Data)
data2 = create(Data2)
@btime  (w1 -> w1.a)(data1)
@btime  ((w1, w2) -> Storage.myCopy(w1, w2))(data1, data1)
@btime  ((w1, w2) -> Storage.myCopy(w1, w2))(data2, data2)

On my machine it outputs:

  10.030 ns (0 allocations: 0 bytes)
  13.046 ns (0 allocations: 0 bytes)
  438.843 ns (1 allocation: 16 bytes)

40x slowdown so even worse than in my full implementation

I looked at the @code_llvm output and it is going through the iterator in the second case. For some reason the compiler doesn’t unroll the loop. Even if I hard-code the fields instead of reading them from the type parameter, I get the same slowdown.
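One possible explanation, offered as a sketch rather than a diagnosis of the MWE: inside the loop the field name is a run-time Symbol, so as soon as the fields have different types the compiler may no longer be able to infer a single concrete type for each access. The structs `OneField` and `TwoFields` and the helper `read_field` below are hypothetical and only mirror the shape of `Data` and `Data2`.

```julia
# Hypothetical structs mirroring the shape of Data and Data2 from the MWE.
mutable struct OneField
    a::Int
end

mutable struct TwoFields
    a::Int
    b::Float32
end

# Field access where the name is only known at run time,
# as it is for the loop variable `field`.
read_field(x, name::Symbol) = getfield(x, name)

# Return types as inferred by the compiler for an unknown field name
one_ret = Base.return_types(read_field, (OneField, Symbol))[1]
two_ret = Base.return_types(read_field, (TwoFields, Symbol))[1]

println("one field:  ", one_ret)
println("two fields: ", two_ret)
```

With a single field there is only one candidate type, which would be consistent with the one-field case staying fast while the two-field case falls back to dynamic dispatch.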

Here is my tuple recursion implementation (I couldn’t find the head/tail syntax in Julia):

function myCopyInner(target::Wrapper{T, F}, source::Wrapper{T, F}) where {T, F}
    # base case, do nothing
end

function myCopyInner(target::Wrapper{T, F}, source::Wrapper{T, F}, fields...) where {T, F}
    setproperty!(target, fields[1], getproperty(source, fields[1]))
    myCopyInner(target, source, fields[2:end]...)
end

function myCopy(target::Wrapper{T, F}, source::Wrapper{T, F}) where {T, F}
    # Same speed when hard coding the fields there
    # myCopyInner(target, source, :a, :b)
    myCopyInner(target, source, keys(F)...)
end

This one is even worse:

  541.287 ns (7 allocations: 128 bytes)

I tried everything and I really have no idea what is getting in the way of the loop unrolling, since it fails even with hard-coded symbols.
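For comparison, here is a sketch of a recursion that carries the field names in a type parameter instead of passing them as run-time values, using `first` and `Base.tail` as the head and tail operations. The `Sample` struct and the name `copyfields!` are hypothetical stand-ins, not the Wrapper from the MWE, and whether this matches a generated function in speed would need benchmarking.

```julia
# Hypothetical stand-in for the wrapped data; not the Wrapper from the MWE.
mutable struct Sample
    a::Int
    b::Float32
end

# Base case: no field names left to copy.
copyfields!(dest, src, ::Val{()}) = dest

# Recursive case: the tuple of names is a type parameter, so each
# name is a compile-time constant within the method body.
function copyfields!(dest, src, ::Val{names}) where {names}
    name = first(names)                      # head of the tuple
    setfield!(dest, name, getfield(src, name))
    return copyfields!(dest, src, Val(Base.tail(names)))  # recurse on the tail
end

# Entry point: start the recursion with all field names of T.
copyfields!(dest::T, src::T) where {T} = copyfields!(dest, src, Val(fieldnames(T)))

d = Sample(0, 0.0f0)
s = Sample(3, 1.5f0)
copyfields!(d, s)
println((d.a, d.b))
```

The design choice here is that dispatch on `Val{names}` gives each recursion step its own method instance, so every field access uses a constant name.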

Guillaume_Leclerc · February 16, 2021, 6:48am

I adapted that in my code base and got perfect performance.

Even on large structs I get the same timings as if I were copying a Julia Array of the same size. The assembly also looks very clean.

Guillaume_Leclerc · February 16, 2021, 7:04am

On top of understanding why the compiler could not unroll that for loop, I’m also interested in the following question.

In this Wrapper struct:

struct Wrapper{T, F}
    data::T
end

Is there a way I could get something like this:

struct Wrapper{T, F} <: T
    data::T
end

This way I would be able to reuse all the implementations available for T. Since it has the same fields (accessible through getproperty) it should be swappable in place of any variable of type T.

Am I allowed to override isa or something along those lines?

JeffreySarnoff · February 16, 2021, 8:03am

isa is a built-in function – none of those functions are overrideable.
You are welcome to fashion your own, using isa and override that.
Your backstop would be something like this.

isit(x, ::Type{T}) where T = isa(x, T)
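As a sketch of how that backstop could then be overridden for a wrapper: the `Data` and `Wrapper` definitions below are hypothetical copies of the shapes used earlier in the thread, and `:meta` is just a placeholder for the `F` parameter.

```julia
# Hypothetical types standing in for the ones from the question above.
mutable struct Data
    a::Int
end

struct Wrapper{T, F}
    data::T
end

# Fallback: behave exactly like the built-in isa.
isit(x, ::Type{T}) where {T} = isa(x, T)

# Override: a wrapper around a T also counts as a T for isit.
isit(::Wrapper{T}, ::Type{T}) where {T} = true

w = Wrapper{Data, :meta}(Data(1))   # :meta is a placeholder for F

println(isit(w, Data))   # uses the override
println(isit(w, Int))    # falls back to isa
println(isa(w, Data))    # the built-in isa is unaffected
```

Only code that calls `isit` sees the override; method dispatch and the built-in `isa` still treat the wrapper as its own type.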

rafael.guerra · February 16, 2021, 1:26pm

Meta question on this post: is it appropriate to have it labelled under “First Steps”?
Because it might well scare the hell out of the non-computer scientists, if those are the “first steps” to work with Julia.

JeffreySarnoff · February 16, 2021, 1:46pm

Meta answer: good catch – @admin, care to remove the “First Steps” tag?

Henrique_Becker · February 16, 2021, 1:52pm

I did not find a “First Steps” tag, but this post was in the “Usage > First Steps” category. I have relocated it to “Usage > Performance”, which is more appropriate.

Guillaume_Leclerc · February 16, 2021, 5:08pm

I’m sorry I posted it in the wrong section. These were genuinely my first steps with Julia. I know it was about performance, but I wanted to signal that I had zero experience and might be making trivial mistakes or missing obvious features of the language.