It looks like you aren’t updating n.J
in this loop? If so, you can just compute its LU factorization once and re-use it.
Also, it seems like it would be a lot easier, and possibly faster, to use SVector
and SArray
rather than MVector
and MArray
in this code. Then you can ditch all the “in-place” operations, write in a more natural style, and it still won’t allocate.