I want to switch out an NTuple’s element, like this:
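julia> function switch_k(tup::NTuple{N,T}, elem::T, pos::Integer) where {N,T}
           # _assume(0 < pos <= N)
           # Use != rather than !==: pos may be any Integer type, and !== is
           # always true when pos and i have different types.
           return ntuple(i -> (pos != i) ? tup[i] : elem, N)
       end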
julia> tup = ntuple(i->i+0.0, 10)
(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0)
julia> switch_k(tup, -1.5, 2)
(1.0, -1.5, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0)
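For comparison, Base has a non-exported `Base.setindex` method for tuples that returns a copy with one element replaced. The following is only a minimal check of that method on the same input; it says nothing about whether the generated code is any better than the ntuple version:

```julia
tup = ntuple(i -> i + 0.0, 10)

# Base.setindex is not exported; it returns a new tuple and leaves tup unchanged.
new_tup = Base.setindex(tup, -1.5, 2)
```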
The position is computed at run time (and N is behind a function boundary for local type stability). The proper way of doing this is to memcpy all of tup and then use pos as an offset into the stack memory to overwrite a single slot. Unfortunately, I cannot seem to get this when I look at julia> @code_llvm switch_k(tup, -1.5, 2) (or @code_native).
Instead, LLVM tries to be clever and emits vectorized compare/select/shuffle code. OK, how should LLVM know that I am really exchanging a single element? That only holds if pos is in bounds. So let’s tell LLVM about that:
julia> @inline function _assume(b::Bool)
           Base.llvmcall(("declare void @llvm.assume(i1)",
               "%assumption = icmp ne i8 %0, 0
               call void @llvm.assume(i1 %assumption)
               ret void"), Nothing, Tuple{Bool}, b)
       end
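julia> function switch_k(tup::NTuple{N,T}, elem::T, pos::Integer) where {N,T}
           _assume(0 < pos <= N)
           # Use != rather than !==: pos may be any Integer type, and !== is
           # always true when pos and i have different types.
           return ntuple(i -> (pos != i) ? tup[i] : elem, N)
       end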
This does not help. So what is the right way to get this? I am perfectly happy with a solution that uses @generated and llvmcall. The use case is in-place manipulation: tup = switch_k(tup, elem, pos) with a run-time computed pos. Basically, I want to use tuples as mutable, stack-allocated, C-like arrays (and hope that the optimizer turns this into in-place modifications that use run-time computed memory offsets; stack-based buffer overflows FTW).
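One way to spell out the "copy everything, then store at a run-time offset" idea directly is to copy the tuple into a `Ref` and write through a raw pointer. This is a minimal sketch, not a verified solution: the name `switch_k_ref` is made up for illustration, it assumes T is an isbits type so that the tuple is laid out inline as N contiguous values, and whether the optimizer removes the `Ref` allocation and produces the desired memcpy plus single store would still have to be checked with @code_llvm.

```julia
# Sketch only: assumes T is an isbits type (e.g. Float64), so that the tuple
# is stored inline in the Ref as N contiguous values of type T.
function switch_k_ref(tup::NTuple{N,T}, elem::T, pos::Integer) where {N,T}
    isbitstype(T) || throw(ArgumentError("switch_k_ref requires an isbits element type"))
    0 < pos <= N || throw(BoundsError(tup, pos))     # keep the store in bounds
    r = Ref(tup)                                      # mutable copy of the whole tuple
    GC.@preserve r begin
        p = convert(Ptr{T}, pointer_from_objref(r))   # address of the first element
        unsafe_store!(p, elem, pos)                   # 1-based store at a run-time offset
    end
    return r[]                                        # read the modified tuple back
end

tup = ntuple(i -> i + 0.0, 10)
tup = switch_k_ref(tup, -1.5, 2)
```

The explicit bounds check is kept because an out-of-range pos would otherwise write past the end of the copy; dropping it trades safety for whatever speed the check costs.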