My package, CliffordNumbers.jl, has a type BitIndices{Q,C<:AbstractCliffordNumber{Q}} <: AbstractVector{BitIndex{Q}}, whose length is the same as that of C and is known at compile time. Indexing these objects depends on the type parameter C, but the resulting values are also known at compile time.
In principle, I would expect Julia to be able to unroll loops that iterate over an object of this type and to constant-propagate each element:
function iteration_test()
    result = BitIndex(Val(VGA(3)))
    for i in BitIndices(CliffordNumber{VGA(3)}) # 8 iterations
        result *= i
    end
    return result
end
However, when I lower this code, I find that loop control flow overhead is introduced anyway:
julia> @code_warntype iteration_test()
MethodInstance for iteration_test()
from iteration_test() @ Main REPL[3]:1
Arguments
#self#::Core.Const(iteration_test)
Locals
@_2::Union{Nothing, Tuple{BitIndex{VGA(3)}, Tuple{Base.OneTo{Int64}, Int64}}}
result::BitIndex{VGA(3)}
i::BitIndex{VGA(3)}
Body::BitIndex{VGA(3)}
1 ─ %1 = Main.VGA(3)::Core.Const(VGA(3))
│ %2 = Main.Val(%1)::Core.Const(Val{VGA(3)}())
│ (result = Main.BitIndex(%2))
│ %4 = Main.CliffordNumber::Core.Const(CliffordNumber)
│ %5 = Main.VGA(3)::Core.Const(VGA(3))
│ %6 = Core.apply_type(%4, %5)::Core.Const(CliffordNumber{VGA(3)})
│ %7 = Main.BitIndices(%6)::Core.Const(BitIndex{VGA(3)}[BitIndex(Val(VGA(3))), BitIndex(Val(VGA(3)), 1), BitIndex(Val(VGA(3)), 2), BitIndex(Val(VGA(3)), 1, 2), BitIndex(Val(VGA(3)), 3), BitIndex(Val(VGA(3)), 1, 3), BitIndex(Val(VGA(3)), 2, 3), BitIndex(Val(VGA(3)), 1, 2, 3)])
│ (@_2 = Base.iterate(%7))
│ %9 = (@_2::Core.Const((BitIndex(Val(VGA(3))), (Base.OneTo(8), 1))) === nothing)::Core.Const(false)
│ %10 = Base.not_int(%9)::Core.Const(true)
└── goto #4 if not %10
2 ┄ %12 = @_2::Tuple{BitIndex{VGA(3)}, Tuple{Base.OneTo{Int64}, Int64}}
│ (i = Core.getfield(%12, 1))
│ %14 = Core.getfield(%12, 2)::Tuple{Base.OneTo{Int64}, Int64}
│ (result = result * i)
│ (@_2 = Base.iterate(%7, %14))
│ %17 = (@_2 === nothing)::Bool
│ %18 = Base.not_int(%17)::Bool
└── goto #4 if not %18
3 ─ goto #2
4 ┄ return result
This is indicated by @_2::Union{Nothing, Tuple{BitIndex{VGA(3)}, Tuple{Base.OneTo{Int64}, Int64}}}, which appears because Base.iterate(A::AbstractArray, state=(eachindex(A),)) is invoked at runtime. (A return value of nothing indicates the end of iteration.) For this reason, I have to use @generated functions to manually unroll loops involving iteration over this object, particularly products.
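To make the workaround concrete, here is a minimal sketch of that kind of manual unrolling. It does not use the actual BitIndices type: StaticCount{N}, looped_prod and unrolled_prod are hypothetical names, standing in for a singleton vector whose length and elements are determined by its type parameters.

```julia
# Hypothetical stand-in for BitIndices: a singleton vector whose length N
# is a type parameter, so its size and elements are known at compile time.
struct StaticCount{N} <: AbstractVector{Int} end

Base.size(::StaticCount{N}) where {N} = (N,)
Base.getindex(::StaticCount, i::Int) = i  # bounds check omitted for brevity

# Ordinary loop: this goes through Base.iterate for AbstractArray.
function looped_prod(x::StaticCount)
    result = 1
    for i in x
        result *= i
    end
    return result
end

# Manual unrolling: the generator builds the expression
# ((1 * x[1]) * x[2]) * ... * x[N], so the method body contains no loop.
@generated function unrolled_prod(x::StaticCount{N}) where {N}
    ex = :(1)
    for i in 1:N
        ex = :($ex * x[$i])
    end
    return ex
end
```

Both functions return the same value, for example looped_prod(StaticCount{4}()) == unrolled_prod(StaticCount{4}()); the difference is only that the second one has a straight-line body by construction, at the cost of writing a generator for every such loop.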
I know that there are packages that allow loops to be unrolled using macros (Unroll.jl, Unrolled.jl), but if possible, I’d like to optimize all iteration over this type, and over any types I construct with similar features (singleton array types with size and indexing known at compile time), without having to make the user do anything special. Are there assumptions baked into AbstractArray subtyping that prevent this optimization from occurring, or is it even possible at all?