It looks like you aren’t updating n.J in this loop? If so, you can just compute its LU factorization once and re-use it.
Also, it seems like it would be a lot easier, and possibly faster, to use SVector and SArray rather than MVector and MArray in this code. Then you can ditch all the “in-place” operations, write in a more natural style, and it still won’t allocate.