@lmiq I have already done this. The question is how to call these functions efficiently without explicitly writing out the 3 or 4 inputs. So far, fun(vec[1],vec[2],vec[3]) is by far the fastest approach I have found (compared to the other options in my original post).
@oxinabox I can test that, but how do I use ntuple with arbitrary indices? I can do ntuple(i -> vec[i], 3) to get the tuple (vec[1], vec[2], vec[3]), but how can I get (vec[4], vec[8], vec[2])?
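To make the question concrete, here is a rough sketch of the variants I have in mind. Everything in it is a stand-in: `fun` is a toy 3-argument function, `v` plays the role of my `vec` (renamed only to avoid shadowing `Base.vec`), and `idx` is a hypothetical tuple holding the indices I want. I am only guessing that looking up `idx[i]` inside the closure is the intended pattern, and I have not benchmarked it.

```julia
# Toy stand-ins, not my real code: `fun`, `v`, and `idx` are illustrative names.
fun(a, b, c) = a + 2b + 3c

v = collect(10.0:10.0:100.0)   # 10 elements: 10.0, 20.0, ..., 100.0
idx = (4, 8, 2)                # arbitrary indices, stored as a tuple

# Baseline: arguments written out explicitly
r_explicit = fun(v[4], v[8], v[2])

# Guess 1: ntuple with an index lookup inside the closure, then splat
args = ntuple(i -> v[idx[i]], 3)
r_ntuple = fun(args...)

# Guess 2: map over the index tuple (map on a tuple returns a tuple), then splat
r_map = fun(map(i -> v[i], idx)...)
```

All three should give the same result; what I do not know is whether the last two are as fast as the explicit call.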
Thanks a lot!