Discussion on Base.@pure usage

Best not to make GitHub work too hard :slight_smile:
I will revise the text(s) and simply post them here.
You can then copy the revised entries into your PR.

Good for me.

I will make a change right away, since @vtjnash has commented there.

I am very glad that our discussion led to a doc improvement, which is covered in PR (39954). For help on @pure usage, please continue reading that PR or, once it is released, the official Julia Base documentation.

For anyone interested in my original question: with a slight modification, the recursive approach for _fielddescr in V5 did the job. After wrapping the symbol in Val{s} and asserting @inbounds, the compiler was able to optimize away the loop/recursion (V6). Solving the problem with a @generated function (V7, V8) also turned out to be surprisingly simple: since _fielddescr already took only type parameters in its last version, I only had to change the return value to a (constant) tuple expression. The @generated version enforces constant propagation the hard way. Benchmarks show no significant speed improvement of V7/V8 over the recursive version in V6. The changes are pushed to GitHub.

The recursive solution:

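@inline function _fielddescr6(::Type{PStruct{T}}, ::Val{s}) where {T<:NamedTuple, s} # s isa Symbol
    # Split the NamedTuple type into a tuple type of field names and a tuple type of field types
    _fielddescr6(Tuple{T.parameters[1]...}, T.parameters[2], Val(s), 0)
end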

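@inline function _fielddescr6(::Type{syms}, ::Type{types}, ::Val{s}, shift::Int) where {syms<:Tuple, types<:Tuple, s}
    @inbounds begin
        # No fields left: s is not a field name (ArgumentError expects a string message)
        syms === Tuple{} && throw(ArgumentError(string(s)))
        type = Base.tuple_type_head(types)
        if s === Base.tuple_type_head(syms)
            return type, shift, bitsizeof(type)
        else
            # Assumed continuation (the post is cut off here): recurse on the remaining fields
            return _fielddescr6(Base.tuple_type_tail(syms), Base.tuple_type_tail(types), Val(s), shift + bitsizeof(type))
        end
    end
end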
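
For readers who want to try the idea without the package, here is a minimal self-contained sketch of the same recursion. It is not the code from the repository: `DemoPStruct`, `bitsizeof` and `fielddescr_rec` are hypothetical stand-ins, and it walks value tuples from `fieldnames`/`fieldtypes` with `Base.tail` instead of using `Base.tuple_type_head`/`Base.tuple_type_tail`. It only demonstrates that the lookup returns the right descriptor; whether the compiler folds it to a constant depends on the Julia version.

```julia
# Hypothetical stand-in for the packed struct: all fields live in one UInt64.
struct DemoPStruct{T<:NamedTuple}
    bits::UInt64
end

# Hypothetical bit width of a field type (assumed: 8 bits per byte of storage).
bitsizeof(::Type{T}) where {T} = 8 * sizeof(T)

# Entry point: the field name arrives as a type parameter via Val.
@inline fielddescr_rec(::Type{DemoPStruct{T}}, ::Val{s}) where {T<:NamedTuple, s} =
    _descr(fieldnames(T), fieldtypes(T), Val(s), 0)

# Base case: ran out of fields without finding s.
@inline _descr(::Tuple{}, ::Tuple{}, ::Val{s}, shift::Int) where {s} =
    throw(ArgumentError("no field $s"))

# Recursive case: check the head, otherwise continue with the tail.
@inline function _descr(syms::Tuple, types::Tuple, ::Val{s}, shift::Int) where {s}
    type = first(types)
    if first(syms) === s
        return (type, shift, bitsizeof(type))
    else
        return _descr(Base.tail(syms), Base.tail(types), Val(s), shift + bitsizeof(type))
    end
end

const NT = NamedTuple{(:a, :b, :c), Tuple{UInt8, UInt16, UInt8}}
fielddescr_rec(DemoPStruct{NT}, Val(:b))  # (UInt16, 8, 16)
```

The result is a (type, shift, bits) triple, which is all a getproperty method needs to extract the field with a shift and a mask.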

The @generated solution:

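@generated function __fielddescr(::Type{PStruct{T}}, ::Val{s}) where {T<:NamedTuple, s} # s isa Symbol
    # This body runs at compile time; T and s are known here
    shift = 0
    types = T.parameters[2].parameters
    syms = T.parameters[1]
    idx = 1
    while idx <= length(syms)
        type = types[idx]
        bits = bitsizeof(type)
        if syms[idx] === s
            # The generated method body is just this constant tuple
            return :(($type, $shift, $bits))
        end
        shift += bits
        idx += 1
    end
    # Assumed (the post is cut off here), mirroring the recursive version: unknown field name
    return :(throw(ArgumentError($(string(s)))))
end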
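
Again as a hedged, self-contained sketch rather than the package code: the same generated-function idea, followed by a getter that uses the returned descriptor. `DemoPStruct`, `bitsizeof`, `fielddescr_gen` and `getpacked` are hypothetical names, and the layout (fields packed from bit 0 upwards into a UInt64) is an assumption for illustration.

```julia
# Hypothetical stand-in for the packed struct: all fields live in one UInt64.
struct DemoPStruct{T<:NamedTuple}
    bits::UInt64
end

# Hypothetical bit width of a field type; must be defined before the generated function.
bitsizeof(::Type{T}) where {T} = 8 * sizeof(T)

@generated function fielddescr_gen(::Type{DemoPStruct{T}}, ::Val{s}) where {T<:NamedTuple, s}
    # Runs at compile time: T is the NamedTuple type, s the field name.
    shift = 0
    for (name, type) in zip(fieldnames(T), fieldtypes(T))
        bits = bitsizeof(type)
        if name === s
            return :(($type, $shift, $bits))  # method body is a constant tuple
        end
        shift += bits
    end
    msg = "no field $s"
    return :(throw(ArgumentError($msg)))
end

# Extract a field with a shift and a mask, using the descriptor above.
@inline function getpacked(x::DemoPStruct{T}, ::Val{s}) where {T, s}
    type, shift, bits = fielddescr_gen(DemoPStruct{T}, Val(s))
    mask = (one(UInt64) << bits) - one(UInt64)
    return ((x.bits >> shift) & mask) % type
end

const NT = NamedTuple{(:a, :b, :c), Tuple{UInt8, UInt16, UInt8}}
x = DemoPStruct{NT}(UInt64(0xab1234cd))  # c = 0xab, b = 0x1234, a = 0xcd
getpacked(x, Val(:b))  # 0x1234
```

Because the generator returns a literal tuple expression, the descriptor is a compile-time constant for each (type, field) pair, which is what "constant propagation the hard way" amounts to.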

Benchmarks (V6 to V8) show that the same runtime speed is now obtained as with handcoded constant propagation (V3 and V4), and all allocations are gone.

@btime bench(sv): some work on an ordinary struct, in a loop on a Vector to get stable timings
  95.776 ns (0 allocations: 0 bytes)
@btime bench(psv): same work on PStruct having same fields as struct in preceding benchmark
  823.200 μs (761 allocations: 11.89 KiB)
@btime benchV2(psv): same work, but using getpropertyV2 instead of getproperty for PStruct field access
  498.700 μs (1161 allocations: 24.39 KiB)
@btime benchV3(psv): same work, but handcoded getpropertyV3 replacing _fielddescr call by its result (simulated constant propagation)
  144.988 ns (0 allocations: 0 bytes)
@btime benchV4(psv): same work, but handcoded getpropertyV4 with resulting SHIFT and AND operation
  133.408 ns (0 allocations: 0 bytes)
@btime benchV5(psv): like V2, but recursive _fielddescr using Base.tuple_type_head and Base.tuple_type_tail in getpropertyV5
  78.100 μs (421 allocations: 6.58 KiB)
@btime benchV6(psv): like V5, but symbol wrapped in Val like in V3 and V4 and @inline assertions
  144.656 ns (0 allocations: 0 bytes)
@btime benchV7(psv): like V8, but symbol wrapped in Val
  144.934 ns (0 allocations: 0 bytes)
@btime benchV8(psv): like V2, but _fielddescr is @generated returning a constant tuple
  144.299 ns (0 allocations: 0 bytes)
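
To make the shape of these benchmarks concrete, here is a small sketch of the V4 idea (handcoded shift and AND) next to the same work on an ordinary struct. The names `Plain`, `pack`, `geta`, `getb`, `bench_plain` and `bench_packed` and the two-field layout are hypothetical and not taken from the benchmarked code; timings are deliberately left out.

```julia
# Hypothetical layout for illustration: a::UInt8 in bits 0-7, b::UInt16 in bits 8-23.
struct Plain
    a::UInt8
    b::UInt16
end

# Pack a Plain value into one UInt64 according to the layout above.
pack(p::Plain) = UInt64(p.a) | (UInt64(p.b) << 8)

# Handcoded accessors: shift and mask are literal constants.
geta(bits::UInt64) = (bits & 0xff) % UInt8
getb(bits::UInt64) = ((bits >> 8) & 0xffff) % UInt16

# Some work on an ordinary struct, in a loop over a Vector.
function bench_plain(v::Vector{Plain})
    total = 0
    for p in v
        total += Int(p.a) + Int(p.b)
    end
    return total
end

# The same work on the packed representation.
function bench_packed(v::Vector{UInt64})
    total = 0
    for bits in v
        total += Int(geta(bits)) + Int(getb(bits))
    end
    return total
end

plain = [Plain(i % UInt8, (3i) % UInt16) for i in 1:1000]
packed = pack.(plain)
bench_plain(plain) == bench_packed(packed)  # both loops compute the same sum
```

With the BenchmarkTools package installed, `@btime bench_plain($plain)` and `@btime bench_packed($packed)` would give timings comparable in spirit to the list above.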

The benchmark figures in the solution were obtained under Julia 1.6.0 RC1.

All other benchmarks were run under Julia 1.5.3.

Julia 1.6.0 RC1 changes the picture: V6 gets as fast as V3/V4.