I want to switch out an NTuple’s element, like this:
julia> function switch_k(tup::NTuple{N,T}, elem::T, pos::Integer) where {N,T}
           #_assume(0<pos<=N)
           # != rather than !==, so that non-Int positions (e.g. Int32) still match
           return ntuple(i -> (pos != i) ? tup[i] : elem, N)
       end
julia> tup = ntuple(i->i+0.0, 10)
(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0)
julia> switch_k(tup, -1.5, 2)
(1.0, -1.5, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0)
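For comparison, Base already ships an unexported function with these semantics for tuples, Base.setindex. The following is a minimal sketch that only checks that the two agree on the example above; it makes no claim about the code either one compiles to. The copy of switch_k is repeated so the snippet runs on its own.

```julia
# Copy of switch_k from the post, with the position compared via !=.
function switch_k(tup::NTuple{N,T}, elem::T, pos::Integer) where {N,T}
    return ntuple(i -> (pos != i) ? tup[i] : elem, N)
end

tup = ntuple(i -> i + 0.0, 10)

# Base.setindex is not exported, so it has to be qualified with `Base.`.
# Unlike switch_k, it throws a BoundsError for an out-of-range position.
a = switch_k(tup, -1.5, 2)
b = Base.setindex(tup, -1.5, 2)
@assert a === b
```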
The position is computed at run time (and N is behind a function boundary for local type stability). The proper way of doing this is to memcpy all of tup and then use pos as an offset into the stack memory to mutate it. Unfortunately, I cannot seem to obtain this when looking at the output of @code_llvm switch_k(tup, -1.5, 2) (or @code_native).
Instead of a copy plus a single store, LLVM tries to be clever and emits a vectorized comparison/select/shuffle. OK, how should LLVM know that I am really exchanging a single element? That only holds if pos is in bounds. So let’s tell LLVM about that:
julia> @inline function _assume(b::Bool)
           Base.llvmcall(("declare void @llvm.assume(i1)",
                          "%assumption = icmp ne i8 %0, 0
                           call void @llvm.assume(i1 %assumption)
                           ret void"), Nothing, Tuple{Bool}, b)
       end
julia> function switch_k(tup::NTuple{N,T}, elem::T, pos::Integer) where {N,T}
           _assume(0<pos<=N)
           # != rather than !==, so that non-Int positions (e.g. Int32) still match
           return ntuple(i -> (pos != i) ? tup[i] : elem, N)
       end
This does not help. So what is the right way of obtaining this? I am definitely happy with a @generated and llvmcall solution. The use case is in-place manipulation: tup = switch_k(tup, elem, pos) with a run-time computed pos. Basically, I want to use tuples as mutable stack-allocated C-like arrays (and hope that the optimizer will turn this into in-place modifications that use run-time computed memory offsets; stack-based buffer overflows FTW).
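To make the intended copy-then-store behaviour concrete, here is a minimal sketch that spells it out by hand with a Ref and unsafe_store!. The names switch_k_ref and demo are hypothetical, and the sketch assumes T is an isbits type so that the tuple is stored inline as N contiguous elements. Whether the Ref allocation is elided and the result is a plain stack copy plus one store depends on the Julia and LLVM version, so the output of @code_llvm should be checked rather than assumed.

```julia
# Hypothetical helper: copy the tuple into a Ref, overwrite one slot through
# a raw pointer, and read the whole tuple back.
function switch_k_ref(tup::NTuple{N,T}, elem::T, pos::Integer) where {N,T}
    # Assumption: T is isbits, so the tuple is N contiguous elements of type T.
    isbitstype(T) || throw(ArgumentError("element type must be isbits"))
    # The store below is unchecked, so validate the position explicitly.
    0 < pos <= N || throw(BoundsError(tup, pos))
    r = Ref(tup)  # mutable copy of the tuple
    GC.@preserve r begin
        p = convert(Ptr{T}, pointer_from_objref(r))  # address of the first element
        unsafe_store!(p, elem, pos)                  # 1-based element offset
    end
    return r[]
end

# Run-time computed positions, in the tup = f(tup, elem, pos) style.
function demo()
    buf = ntuple(i -> i + 0.0, 10)
    for k in (2, 7)
        buf = switch_k_ref(buf, -Float64(k), k)
    end
    return buf
end

@assert demo() == (1.0, -2.0, 3.0, 4.0, 5.0, 6.0, -7.0, 8.0, 9.0, 10.0)
```

Dropping the explicit bounds check would reproduce the unchecked, C-like behaviour, at the cost of writing outside the tuple for a bad pos.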