Efficient reflection on structs

Hello,

I am trying to do something similar to StructArray, but I can’t seem to generate efficient code.

Let’s say there is struct Wrapper{T} and it is supposed to behave similarly to T except I’m managing where/how the underlying fields are stored.

I have a working version that uses getproperty, fieldnames and fieldtype. The problem is that the compiler is unable to do any optimization and performance is really poor.

My second idea was to pass all the information needed as part of the type parameters (NamedTuple with types and some extra info I need) and compute this once and for all before instantiating the wrapper. The problem is that I can’t pass any structure that has a DataType as part of a type parameter.

How can I generate efficient code for this kind of application?

Generally speaking, the compiler is very good at dealing with this sort of thing. Could you provide a minimal working example of the problem you’re seeing? It’s hard to advise on this sort of stuff in abstract terms.

Of course! Give me a minute and I’ll give a simplified example

module Storage

struct Wrapper{T}
    d::T
end

function Base.getproperty(wrapper::Wrapper{T}, s::Symbol) where {T}
    fields = fieldnames(T)
    if s in fields
        3
    end
    2
end

end

struct Data
    var_1::Int
    var_2::Float32
end

a = Data(1, 0.5)
b = Storage.Wrapper{Data}(a)

and the timings:

julia> @btime a.var_1
  31.502 ns (0 allocations: 0 bytes)
1

julia> @btime b.var_1
  170.836 ns (1 allocation: 32 bytes)
2

To be honest I’m even surprised there is a single allocation there. It’s supposed to return a constant after all optimizations are applied. It should be faster than accessing the struct.

Looking at the LLVM code, I’m also wondering why there is still a call to fieldnames. Given a type T, it’s constant. Why is there no constant propagation?

I think the issue is that constant propagation doesn’t work through the in function; it’s just too complex. For your MWE, this fixes it though:

function Base.getproperty(wrapper::Wrapper{T}, s::Symbol) where {T}
    if hasfield(T,s)
        3
    end
    2
end

Note also that to enable constant propagation you need the thing you’re benchmarking inside a function, so something like:

@btime (a -> a.var_1)(a)
@btime (b -> b.var_1)(b)
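
To make the two points above concrete, here is a minimal sketch that combines them. It assumes the wrapper should forward field access to the wrapped value, which the MWE above does not actually do (it only returns constants); get_var_1 is a hypothetical helper name, and the inferred type printed at the end may vary between Julia versions.

```julia
# Sketch only: a wrapper that forwards property access to the wrapped value.
struct Wrapper{T}
    d::T
end

function Base.getproperty(w::Wrapper{T}, s::Symbol) where {T}
    # hasfield(T, s) can be resolved at compile time when T and s are known constants
    if hasfield(T, s)
        return getfield(getfield(w, :d), s)
    end
    return getfield(w, s)  # fall back to the wrapper's own fields (e.g. :d)
end

struct Data
    var_1::Int
    var_2::Float32
end

# Accessing the property inside a function lets the compiler treat :var_1 as a constant.
get_var_1(w) = w.var_1

w = Wrapper(Data(1, 0.5f0))
println(get_var_1(w))

# Inspect what inference concludes for the wrapped access.
println(Base.return_types(get_var_1, (Wrapper{Data},)))
```

Benchmarking get_var_1(w) rather than w.var_1 at global scope follows the same reasoning as the @btime lines above.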

Oh that’s smart! Thank you.

What counts as “too complicated”? Is there a way to determine which calls to track down without taking time from people on the forums?

Also very weird:

function Base.getproperty(wrapper::Wrapper{T}, s::Symbol) where {T}
    fields = fieldnames(T)
    if hasfield(T, s)
        3
    end
    2
end

This is still slow. I left fields in by mistake, but I thought Julia did dead code elimination. What is going on there?

Thanks!

Sorry, I think I gave the wrong reason. Your original code was slow simply because it was type unstable, because of fieldnames:

@code_warntype b.var_1

Variables
  #self#::Core.Const(getproperty)
  wrapper::Wrapper{Data}
  s::Symbol
  fields::Tuple{Vararg{Symbol, N} where N}

Body::Int64
1 ─      (fields = Main.fieldnames($(Expr(:static_parameter, 1))))
│   %2 = (s in fields)::Bool
└──      goto #2 if not %2
2 ─      return 2

I guess I’m slightly surprised by that; it seems like it could be fixed. The missing dead code elimination is probably because fieldnames isn’t pure, so the compiler doesn’t know that it has no other side effects.

Why was it type-unstable? fieldnames just returns a Tuple of Symbols, and the length of the tuple is constant given T.

Why isn’t it pure either? I thought that, by convention, all functions without ! were pure. Since it’s a built-in, I would assume it follows the convention?

Thanks again. This is really helping me understand the language.

It’s just an issue with Julia, not your usage of it. It looks like fieldnames is not written in a way that’s type stable (you’ll note above that the length of the tuple is not inferred, since N is a free variable). It definitely seems like this could be improved. My guess as to why it is this way is that fieldnames wasn’t meant to be used in performance-critical code; instead you have hasfield for cases like yours.
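
If you want to check this kind of thing yourself, one option is to ask inference directly what it concludes from argument types alone. This is only a sketch: Data is the struct from the MWE, and the result printed for fieldnames depends on the Julia version, so no expected output is shown.

```julia
struct Data
    var_1::Int
    var_2::Float32
end

# Return types inferred when only the argument *types* are known.
# The result for fieldnames is version-dependent.
println(Base.return_types(fieldnames, (Type{Data},)))

# hasfield returns a Bool regardless of which symbol is passed.
println(Base.return_types(hasfield, (Type{Data}, Symbol)))
```

A Tuple type with a free length parameter in the first result corresponds to the fields::Tuple{Vararg{Symbol, N} where N} line in the @code_warntype output above.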

The ! is just a loose convention; there’s a separate, much stricter definition of @pure. (I should probably mention that I’m not an expert on this. I think there are probably cases where LLVM can do dead-code elimination on stuff inside non-pure Julia functions if it can figure out there are no side effects; that just didn’t happen in your example.)

Thanks! So the takeaway is essentially that it should be fast in theory but it is not.

I will of course use hasfield, but what is the thing to do in that case? Do people file issues, or do they just work around the problem when they encounter one?

Search the GitHub issues to see whether this has been discussed before; maybe there is a non-trivial reason for it to be the way it is. But most likely this was just not a priority, so by all means: do file an issue if there is none already!

Oh and the ! is not about purity, it’s about mutating arguments. Take print, not mutating in the common sense but not pure either.
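
A small illustration of that distinction, using only standard Base functions (the variable names are just for the example):

```julia
v = [3, 1, 2]

sorted = sort(v)  # no `!`: returns a new array and leaves `v` untouched
@show v sorted

sort!(v)          # `!`: reorders `v` in place
@show v

# `print` has no `!` and does not mutate its arguments in the usual sense,
# yet it is not pure: it writes to an output stream as a side effect.
print("hello\n")
```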

Agree, I think it couldn’t hurt filing an Issue, although I’m not sure it’d be super high priority. There does look to be some related discussion in

Good point. I don’t know why I didn’t think of that…

I think it is more that anything that deals with IO is assumed to be mutating, and the founders found it a bother to suffix every IO function.

Well, IO manipulation is commonly considered a side-effect, but I see your point.
Mine still stands: ! indicates that one argument will be directly mutated.
Purity is a different rabbit hole, there are various shades of purity and most of the commonly found functions are not pure in the strict sense.

I don’t know if I should make another post or continue here but here I go.

Any idea why this is not type-stable either:

struct Test
    a::Int
    b::Float32
end

function init(::Type{T}, s::Symbol) where T
    tpe = fieldtype(T, s)
    zero(tpe)
end

@inferred init(Test, :a)
  • init is parametric on T
  • s is a symbol so it should be able to dispatch on it
  • for a given T and s, tpe is constant

Same reason as above, the code needs to be in a function for Julia to try and propagate the constant value :a (which it needs to do to infer this). You can combine putting it in a function and running @inferred in one line like:

@inferred (() -> init(Test, :a))()
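
Here is the same idea as a self-contained sketch. Point is a hypothetical stand-in for your Test struct (a struct named Test would clash with the Test standard library that provides @inferred), and init_a / init_b are hypothetical helper names.

```julia
using Test  # provides @inferred

# Hypothetical stand-in for the `Test` struct above.
struct Point
    a::Int
    b::Float32
end

function init(::Type{T}, s::Symbol) where {T}
    tpe = fieldtype(T, s)
    return zero(tpe)
end

# The literal symbols are visible inside these small functions, so inference
# can propagate them into `init` and resolve `fieldtype(Point, :a)`.
init_a() = init(Point, :a)
init_b() = init(Point, :b)

# @inferred throws if the inferred return type differs from the actual one.
@inferred init_a()
@inferred init_b()
```

Named zero-argument functions behave the same way as the anonymous function in the one-liner; they are just easier to reuse with @btime or @code_warntype.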

Thanks again @marius311. Base.@pure is very useful in making the compiler do what I want. It turns out that fieldtype was also not considered pure, but wrapping it made all the optimizations happen :)

@marius311’s suggestion was not to use Base.@pure, and you probably should not be using it. See the Base.@pure documentation. If I am not wrong, just calling a function that may be extended by others (or is automatically extended in some cases) makes the function ineligible for @pure. Unfortunately, we would need to bother someone like @JeffreySarnoff to get a better idea of whether fieldtype should be wrapped with @pure (I think it would be already if that were the case, as it is a built-in function).

How would I do this without @pure, then:

function myInit(T, s::Symbol)
    tpe = fieldtype(T, s)
    zero(tpe)
end

With it, the LLVM code is what I expect: a constant.
Without it, the code is so nasty I don’t even feel like working out what is going on (and it also just ends up returning a constant).