I am sorry, Discourse told me that I shouldn't write so many posts in a row and that it's better to edit a previous one. I didn't write out the changed code; I left it as an exercise. To optimize it you should use the same ideas: fewer allocations and fewer unnecessary passes over this huge matrix.
Of course, if you run into any problems, I can show the solution.