Efficient reflection on structs

Guillaume_Leclerc · February 14, 2021, 9:24pm

Hello,

I am trying to do something similar to StructArray but I can’t seem to generate efficient code.

Let’s say there is struct Wrapper{T} and it is supposed to behave similarly to T except I’m managing where/how the underlying fields are stored.

I have a working version that uses getproperty, fieldnames and fieldtype. The problem is that the compiler is unable to do any optimization and performance is really poor.

My second idea was to pass all the information needed as part of the type parameters (NamedTuple with types and some extra info I need) and compute this once and for all before instantiating the wrapper. The problem is that I can’t pass any structure that has a DataType as part of a type parameter.

How can I generate efficient code for this kind of application?

Mason · February 14, 2021, 9:26pm

Generally speaking, the compiler is very good at dealing with this sort of thing. Could you provide a minimal working example of the problem you’re seeing? It’s hard to advise on this sort of stuff in abstract terms.

Guillaume_Leclerc · February 14, 2021, 9:27pm

Of course! Give me a minute and I’ll give a simplified example

Guillaume_Leclerc · February 14, 2021, 9:37pm

module Storage

struct Wrapper{T}
    d::T
end

function Base.getproperty(wrapper::Wrapper{T}, s::Symbol) where {T}
    fields = fieldnames(T)
    if s in fields
        3
    end
    2
end

end

struct Data
    var_1::Int
    var_2::Float32
end

a = Data(1, 0.5)
b = Storage.Wrapper{Data}(a)

and the timings:

julia> @btime a.var_1
  31.502 ns (0 allocations: 0 bytes)
1

julia> @btime b.var_1
  170.836 ns (1 allocation: 32 bytes)
2

To be honest I’m even surprised there is a single allocation there. It’s supposed to return a constant after all optimizations are applied. It should be faster than accessing the struct.

Looking at the LLVM code, I’m also wondering why there is still a call to fieldnames. Given a type T, it’s constant. Why is there no constant propagation?

marius311 · February 14, 2021, 9:46pm

I think the issue is that constant propagation doesn’t work through the in function; it’s just too complex. For your MWE, though, this fixes it:

function Base.getproperty(wrapper::Wrapper{T}, s::Symbol) where {T}
    if hasfield(T,s)
        3
    end
    2
end

Note also that to enable constant propagation you need the thing you’re benchmarking inside a function, so something like:

@btime (a -> a.var_1)(a)
@btime (b -> b.var_1)(b)
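As a rough sketch of how the hasfield approach might look in a wrapper that actually forwards field access (the forwarding logic and the helper read_var_1 are assumptions for illustration, not part of the MWE above), and how to check inference from inside a function using only the Test standard library:

```julia
using Test: @inferred

struct Data
    var_1::Int
    var_2::Float32
end

struct Wrapper{T}
    d::T
end

# Forward property access to the wrapped value when T has that field.
# getfield is used internally so that we do not recurse into getproperty.
function Base.getproperty(w::Wrapper{T}, s::Symbol) where {T}
    if hasfield(T, s)
        return getfield(getfield(w, :d), s)
    end
    return getfield(w, s)
end

# Hypothetical helper: the literal :var_1 sits inside a function body,
# so the compiler gets a chance to propagate it as a constant.
read_var_1(w) = w.var_1

b = Wrapper(Data(1, 0.5f0))
@inferred read_var_1(b)  # throws if the return type is not concretely inferred
```

The same pattern applies to benchmarking: wrap the access in a function (or an anonymous function, as above) rather than timing b.var_1 directly at global scope.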

Guillaume_Leclerc · February 14, 2021, 9:52pm

Oh that’s smart! Thank you.

What counts as “too complicated”? Is there a way to track down these cases myself without taking time from people on the forums?

Also very weird:

function Base.getproperty(wrapper::Wrapper{T}, s::Symbol) where {T}
    fields = fieldnames(T)
    if hasfield(T, s)
        3
    end
    2
end

This is still slow. I left fields in by mistake, but I thought Julia did dead code elimination. What is going on there?

Thanks!

marius311 · February 14, 2021, 9:59pm

Sorry, I think I gave the wrong reason. Your original code was slow simply because it was type unstable, and that comes from fieldnames:

@code_warntype b.var_1

Variables
  #self#::Core.Const(getproperty)
  wrapper::Wrapper{Data}
  s::Symbol
  fields::Tuple{Vararg{Symbol, N} where N}

Body::Int64
1 ─      (fields = Main.fieldnames($(Expr(:static_parameter, 1))))
│   %2 = (s in fields)::Bool
└──      goto #2 if not %2
2 ─      return 2

I guess I’m slightly surprised by that; it seems like it could be fixed. The missing dead code elimination is probably because fieldnames isn’t pure, so the compiler doesn’t know that it has no side effects.
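For anyone who wants to check this kind of thing without reading @code_warntype output, here is a small sketch that asks the compiler for inferred return types directly. The helper names names_of and has_name are made up for this example, and the result printed for fieldnames depends on the Julia version, so no output is shown for it:

```julia
struct Data
    var_1::Int
    var_2::Float32
end

# Hypothetical helpers that only forward to the reflection functions.
names_of(::Type{T}) where {T} = fieldnames(T)
has_name(::Type{T}, s::Symbol) where {T} = hasfield(T, s)

# Inferred return types for the given argument types.
rt_names = Base.return_types(names_of, (Type{Data},))
rt_has = Base.return_types(has_name, (Type{Data}, Symbol))

println("fieldnames infers as: ", rt_names)
println("hasfield infers as:   ", rt_has)
```

If the first line prints a tuple type whose length is not fixed, you are seeing the instability described above; hasfield always returns a Bool, so it has no such problem.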

Guillaume_Leclerc · February 14, 2021, 10:05pm

Why was it type-unstable? fieldnames just returns a Tuple of Symbols, and the length of the tuple is constant given T.

Why isn’t it pure either? I thought by convention all functions without ! were pure. Since it’s a built-in, I would assume it follows the convention?

Thanks again. This is really helping me understand the language.

marius311 · February 14, 2021, 10:20pm

It’s just an issue with Julia, not your usage of it. It looks like fieldnames is not written in a way that’s type stable (you’ll note above that the length of the tuple is not inferred, since N is a free variable). It definitely seems like this could be improved. My guess as to why it is this way is that fieldnames wasn’t meant to be used in performance-critical code; instead you have hasfield for cases like yours.

The ! is just a loose convention, there’s a separate much more strict definition of @pure. (I should probably mention that I’m not an expert on this, I think there’s probably cases where LLVM can do dead-code elimination on stuff inside non-pure Julia functions if it can figure out there’s no side-effects, that just didn’t happen in your example).

Guillaume_Leclerc · February 14, 2021, 10:29pm

Thanks! So the takeaway is essentially that it should be fast in theory but it is not.

I will of course use hasfield, but what is the right thing to do in a case like this? Do people file issues or just work around the problem when they encounter one?

FPGro · February 14, 2021, 11:13pm

Search the GitHub issues to see if this has been discussed before; maybe there is a non-trivial reason for it to be the way it is. But most likely this was just not a priority, so by all means: do file an issue if there is none already!

Oh and the ! is not about purity, it’s about mutating arguments. Take print, not mutating in the common sense but not pure either.

marius311 · February 14, 2021, 11:14pm

Agree, I think it couldn’t hurt filing an Issue, although I’m not sure it’d be super high priority. There does look to be some related discussion in

Should fieldnames return a tuple rather than an array? · Issue #25327 · JuliaLang/julia · GitHub
Change fieldnames() and propertynames() to return a tuple rather than an array by nalimilan · Pull Request #25725 · JuliaLang/julia · GitHub

Guillaume_Leclerc · February 14, 2021, 11:15pm

Good point. I don’t know why I didn’t think of that…

Henrique_Becker · February 14, 2021, 11:24pm

I think it is more that anything dealing with IO is assumed to be mutating, and the founders found it a bother to suffix every IO function with !.

FPGro · February 15, 2021, 12:11am

Well, IO manipulation is commonly considered a side effect, but I see your point.
Mine still stands: ! indicates that an argument will be directly mutated.
Purity is a different rabbit hole; there are various shades of purity, and most commonly used functions are not pure in the strict sense.
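To illustrate the distinction with standard library functions only (a small sketch, nothing specific to the wrapper problem): sort and sort! differ in whether the argument is mutated, while print carries no ! even though it changes the stream it writes to.

```julia
v = [3, 1, 2]

sorted = sort(v)   # no !: returns a new vector, v is left untouched
unchanged = copy(v)

sort!(v)           # !: v itself is reordered in place

# print has no !, yet it is not pure: it changes the state of the IO object.
io = IOBuffer()
print(io, "hello")
written = String(take!(io))
```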

Guillaume_Leclerc · February 15, 2021, 1:24am

I don’t know if I should make another post or continue here but here I go.

Any idea why this is not type-stable either:

struct Test
    a::Int
    b::Float32
end

function init(::Type{T}, s::Symbol) where T
    tpe = fieldtype(T, s)
    zero(tpe)
end

@inferred init(Test, :a)

init is parametric on T
s is a symbol so it should be able to dispatch on it
for a given T and s, tpe is constant

marius311 · February 15, 2021, 2:02am

Same reason as above, the code needs to be in a function for Julia to try and propagate the constant value :a (which it needs to do to infer this). You can combine putting it in a function and running @inferred in one line like:

@inferred (() -> init(Test, :a))()
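Here is a self-contained version of the same idea, as a sketch. The struct is renamed to Sample (an arbitrary name) so it cannot clash with the Test standard library module that provides @inferred, and init_a / init_b are hypothetical named wrappers standing in for the anonymous function:

```julia
using Test: @inferred

struct Sample
    a::Int
    b::Float32
end

init(::Type{T}, s::Symbol) where {T} = zero(fieldtype(T, s))

# The literal symbols live inside function bodies, so inference can
# propagate them as constants into init and resolve fieldtype.
init_a() = init(Sample, :a)
init_b() = init(Sample, :b)

@inferred init_a()  # throws if the return type is not concretely inferred
@inferred init_b()
```

Calling @inferred init(Sample, :a) directly only tells inference that the second argument is some Symbol, which is why that form fails.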

Guillaume_Leclerc · February 15, 2021, 4:09pm

Thanks again @marius311. Base.@pure is very useful in making the compiler do what I want. It turns out that fieldtype was also not considered pure, but wrapping it made all the optimizations happen.

Henrique_Becker · February 15, 2021, 4:21pm

@marius311’s suggestion was not that you use Base.@pure, and you probably should not be using it. See the Base.@pure documentation. If I am not wrong, just calling a function that may be extended by others (or is automatically extended in some cases) makes the function ineligible for @pure. Unfortunately we would need to bother someone like @JeffreySarnoff to get a better idea of whether fieldtype should be wrapped with @pure (I think it already would be if that were the case, as it is a built-in function).

Guillaume_Leclerc · February 15, 2021, 4:28pm

How would I do this then without @pure:

function myInit(T, s::Symbol)
    tpe = fieldtype(T, s)
    zero(tpe)
end

With it, the LLVM code is what I expect: a constant.
Without it, it’s so nasty I don’t even feel like understanding what is going on (and it also just ends up returning a constant).
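One possible way to avoid @pure here, offered as a sketch rather than something proposed in this thread, is to move the field name into the type domain with Val, so the compiler knows it from the method signature and does not have to rely on constant propagation. The names Sample and my_init are hypothetical:

```julia
using Test: @inferred

struct Sample
    a::Int
    b::Float32
end

# Both T and s are static parameters, so fieldtype(T, s) is called on
# values the compiler already knows when it specializes this method.
my_init(::Type{T}, ::Val{s}) where {T, s} = zero(fieldtype(T, s))

@inferred my_init(Sample, Val(:a))  # throws if not concretely inferred
@inferred my_init(Sample, Val(:b))
```

The trade-off is that callers have to write Val(:a) instead of :a, and building a Val from a symbol that is only known at run time brings back dynamic dispatch.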
