Type issue in for loop over items in struct field

Hi, I’m encountering what I think is a type(-stability?) issue in the below MWE; is there a way in which I can resolve the problem within fun_test? (I’d rather not change the structure of Aux, BasisOperation and AOperation)

abstract type BasisOperation end

struct AOperation <: BasisOperation
end

struct Aux 
    op_vec::Vector{BasisOperation}
end

function fun_test()
    aux = Aux(fill(AOperation, 10))
    for op in aux.op_vec
        continue
    end
end

@code_warntype fun_test()

Which gives

MethodInstance for fun_test()
  from fun_test() @ ...
Arguments
  #self#::Core.Const(fun_test)
Locals
  @_2::Union{Nothing, Tuple{BasisOperation, Int64}} <-- this line is red
  aux::Aux
  op::BasisOperation <-- this line is red
Body::Nothing
1 ─ %1  = Main.fill(Main.HermiteOperation, 10)::Vector{DataType}
│         (aux = Main.Aux(%1))
│   %3  = Base.getproperty(aux, :op_vec)::Vector{BasisOperation}
│         (@_2 = Base.iterate(%3))
│   %5  = (@_2 === nothing)::Bool
│   %6  = Base.not_int(%5)::Bool
└──       goto #5 if not %6
2 ┄ %8  = @_2::Tuple{BasisOperation, Int64} <-- this line is red
│         (op = Core.getfield(%8, 1))
│   %10 = Core.getfield(%8, 2)::Int64
└──       goto #3
3 ─       (@_2 = Base.iterate(%3, %10))
│   %13 = (@_2 === nothing)::Bool
│   %14 = Base.not_int(%13)::Bool
└──       goto #5 if not %14
4 ─       goto #2
5 ┄       return nothing

I’ve indicated the three lines that are red in my Julia output by <-- this line is red

Yes, you should use:

julia> struct Aux{T<:BasisOperation} 
           op_vec::Vector{T}
       end

such that the field op_vec is concrete when instantiated, to a specific subtype of BasisOperations.

2 Likes

@lmiq thank you, that solves the problem I posted. I wasn’t aware that this (rather than the for-loop) was the source of the problem, so it turns out I’ve accidentally XY-ed the problem. In my problem, I may need to combine multiple different BasisOperations in Aux, see the below MWE. Is there a similar way to resolve the problem here?

abstract type BasisOperation end

struct AOperation <: BasisOperation
end

struct BOperation <: BasisOperation
end

struct Aux{T <: BasisOperation}
    op_vec::Vector{T}
end

function fun_test()
    aux = Aux(fill(AOperation(), 10))
    for op in aux.op_vec
        continue
    end
end

function fun_test2()
    aux = Aux([AOperation(), BOperation()])
    for op in aux.op_vec
        continue
    end
end

#@code_warntype fun_test() # now works! 
@code_warntype fun_test2()

which gives

MethodInstance for fun_test2()
  from fun_test2() @ Main ...
  #self#::Core.Const(fun_test2)
Locals
  @_2::Union{Nothing, Tuple{BasisOperation, Int64}} <-- this line is red
  aux::Aux{BasisOperation}
  op::BasisOperation <-- this line is red
Body::Nothing
1 ─ %1  = Main.AOperation()::Core.Const(AOperation())
│   %2  = Main.BOperation()::Core.Const(BOperation())
│   %3  = Base.vect(%1, %2)::Vector{BasisOperation}
│         (aux = Main.Aux(%3))
│   %5  = Base.getproperty(aux, :op_vec)::Vector{BasisOperation}
│         (@_2 = Base.iterate(%5))
│   %7  = (@_2 === nothing)::Bool
│   %8  = Base.not_int(%7)::Bool
└──       goto #5 if not %8
2 ┄ %10 = @_2::Tuple{BasisOperation, Int64} <-- this line is red
│         (op = Core.getfield(%10, 1))
│   %12 = Core.getfield(%10, 2)::Int64
└──       goto #3
3 ─       (@_2 = Base.iterate(%5, %12))
│   %15 = (@_2 === nothing)::Bool
│   %16 = Base.not_int(%15)::Bool
└──       goto #5 if not %16
4 ─       goto #2
5 ┄       return nothing


Now, that’s more complicated, because your op_vec must be an abstract container, to support different types of BasisOperations. You’ll need to use some type of sum types. For example, for a recently posted and useful package:

julia> using DynamicSumTypes

julia> @sumtype SumTypeBasisOperations(AOperation, BOperation) <: BasisOperation
SumTypeBasisOperations

julia> function fun_test2()
           aux = Aux(SumTypeBasisOperations[AOperation(), BOperation()])
           for op in aux.op_vec
               continue
           end
       end
fun_test2 (generic function with 1 method)

julia> @code_warntype fun_test2() # clean

see: [ANN] DynamicSumTypes.jl v3

Maybe use a tuple instead of a vector? Do you need to mutate op_vec after it is created?

What kinds of operations are you trying to represent? What is your end-goal here?

1 Like

No, I don’t need to mutate op_vec after it is created. I totally forgot about using Tuples for things like this, that sounds like an excellent idea. If I replace Vector{T} by Tuple{T} in the previous, the problem seems solved!

In terms of operations: subtypes of BasisOperation represent certain rules to modify the value of a certain column in a matrix based on the values in other columns, so they record the target column, what other columns to use and the type of operation. I’m using this to procedurally generate matrices based on certain basis functions. (It makes sense in my context, not sure how clearly I’m bringing it across here)

If the number of different types that the tuple contains is large (>100 maybe) you may run into low compilation times. But if not, the tuples are certainly the best option.

@lmiq. Thank you for pointing this out. So I’m currently opting for an approach where I use the following

struct Aux{N, T <: BasisOperation}
    op_vec::NTuple{N, T}
end

In my use-case, N may be quite large (say several hundreds), but the number of different T’s possible is very limited (because I have about 5 subtypes of BasisOperation and BasisOperation itself. If I understand correctly, this means that only 6 different values for T are possible?). Does that sound fine?

I think tuples will cause you troubles in this case. The tuple type is specific for each field, even if they appear repeatedly:

julia> x = Real[1.0, 1, 1.0f0, 1, 1.0]
5-element Vector{Real}:
 1.0
 1
 1.0f0
 1
 1.0

julia> typeof(ntuple(i -> x[i], 5))
Tuple{Float64, Int64, Float32, Int64, Float64}

thus, if you need to store arrays with a large number of elements, this will be prohibitive for the compiler.

That is unfortunate. I’ll need to think of something else then…

I think you need sum types. But, honestly, I never used them extensively. Probably other people can give you better advice.

1 Like

In my opinion you could use either a sum type as indicated by @Imiq or a Union, I would benchmark the two approaches to see which one is faster (in Julia 1.11 the competition is usually more fierce, on <=1.10 surely a sum type would be faster) . A vector should be fine, if you know N then you can also use a StaticArrays.SizedVector. But actually it also depends on yo

1 Like

Just so I understand correctly: I would make a sum-type or Union type of my various operations, right? So something like

@sumtype BasisOperation begin
   AOperation
   BOperation
   COperation
   ... 
end

where I could still use dispatch to get the correct function (say PerformOperation(X::Matrix{Float64}, op::AOperation) when my op has type AOperation?

Or would I need to define everything by @cases?

No @cases needed (fortunately). With a sum type from DynamicSumTypes you would need to use variant to obtain the type, but then yes, you can just use normal dispatch. See the readme of DynamicSumTypes DynamicSumTypes.jl/README.md at main · JuliaDynamics/DynamicSumTypes.jl · GitHub for more info. With a Union you will not need to do anything at all.

Excellent, thank you! I’ll try them out later to see what works best in my use-case!

You wouldn’t use NTuple for heterogeneous tuple a. You would just use T where T<:Tuple.

But tuples aren’t great once the length gets into the hundreds or more.

There are other options too. If your “operations” are all representing functions with the same input and output types, there is a package that allows you to use typed functions.

FunctionWrappers.jl

2 Likes

Does the result of the whole procedure depend on the order of operation? (It sounds like it probably does, but just to be sure…)

If it doesn’t, then it might be possible to group together operations in some way to make the tuple shorter.


Also, a good ol’ if statement within the loop (instead of having to do a dynamic dispatch each iteration) might also be worth a try. Especially if the number of operation types is rather small and unlikely to be extended later, the code is still relatively concise – the main drawback of a hard-coded if-else is of course that it doesn’t extend very nicely (see the example below).

Example with if/else
using BenchmarkTools

abstract type Operation end

struct AOperation <: Operation end
struct BOperation <: Operation end
struct COperation <: Operation end
struct DOperation <: Operation end
struct EOperation <: Operation end

do_the_thing!(state, op::AOperation) = rand()
do_the_thing!(state, op::BOperation) = rand()
do_the_thing!(state, op::COperation) = rand()
do_the_thing!(state, op::DOperation) = rand()
do_the_thing!(state, op::EOperation) = rand()

function loop_the_operations(operations)
    state = nothing
    for op in operations
        do_the_thing!(state, op)
    end
    return state
end

function loop_the_operations_if_else(operations)
    state = nothing
    for op in operations
        op isa AOperation ? do_the_thing!(state, op) :
        op isa BOperation ? do_the_thing!(state, op) :
        op isa COperation ? do_the_thing!(state, op) :
        op isa DOperation ? do_the_thing!(state, op) :
        op isa EOperation ? do_the_thing!(state, op) :
        do_the_thing!(state, op)
    end
    return state
end

operations = [O() for O in rand(subtypes(Operation), 500)]

@btime loop_the_operations($operations)
# 20.841 μs (500 allocations: 7.81 KiB)
@btime loop_the_operations_if_else($operations)
#    916.324 ns (0 allocations: 0 bytes)


# We can add a new type after the fact, but have to pay the price of dynamic dispatch for the new type
struct FOperation <: Operation end

do_the_thing!(state, op::FOperation) = rand()
newOperations = [O() for O in rand(subtypes(Operation), 500)]

@btime loop_the_operations($newOperations)
# 26.730 μs (500 allocations: 7.81 KiB)

@btime loop_the_operations_if_else($newOperations)
#  1.974 μs (76 allocations: 1.19 KiB)

1 Like

There is no need for the boilerplate of if-elses in Julia 1.11, if you type the vector of operations with Union{AOperation, BOperation, COperation, DOperation, EOperation, FOperation} you will see no dynamic dispatch (or if you use a sum type neither in 1.10). But I understand that extending after the fact is harder (while usually not necessary in my opinion).

The cool fact about a sumtype is that it works like yout if-else version also when the type is abstract and you want to extend it:

julia> using DynamicSumTypes

julia> @sumtype SumOperation(AOperation, BOperation, COperation, DOperation, EOperation) <: Operation

julia> operations = Operation[SumOperation(O()) for O in rand(values(allvariants(SumOperation)), 500)];

julia> function loop_the_operations(operations)
           state = nothing
           for op in operations
               is_sumtype(typeof(op)) ? do_the_thing!(state, variant(op)) : do_the_thing!(state, op)
           end
           return state
       end;

julia> @btime loop_the_operations($operations)
  1.616 μs (0 allocations: 0 bytes)

julia> push!(operations, FOperation());

julia> @btime loop_the_operations($operations)
  1.558 μs (1 allocation: 16 bytes)

While with a Union such a trick doesn’t work.

2 Likes

I see, thank you all for your inputs and time! This gives me a lot of input to try and find out what works best in my situation!

1 Like