Reduce allocations in row-by-row dot product

This will be a lot easier to write with for loops.
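
For illustration only, here is a rough sketch of the loop-based shape this points at. It is not code from this change: the language (Python), the function name `rowwise_dot`, and the caller-supplied `out` buffer are all assumptions made for the example. The idea is to accumulate each row's dot product in a scalar and write it into a preallocated output, so no per-row temporary is built.

```python
def rowwise_dot(a, b, out):
    """Write dot(a[i], b[i]) into out[i] for every row i, reusing `out`.

    Hypothetical helper: `a` and `b` are sequences of equal-length rows,
    and `out` is a preallocated mutable sequence with one slot per row.
    """
    if not (len(a) == len(b) == len(out)):
        raise ValueError("a, b and out must have the same number of rows")
    for i in range(len(a)):
        row_a, row_b = a[i], b[i]
        if len(row_a) != len(row_b):
            raise ValueError(f"row {i}: length mismatch")
        acc = 0.0  # scalar accumulator, no temporary product row
        for j in range(len(row_a)):
            acc += row_a[j] * row_b[j]
        out[i] = acc  # store in place instead of appending to a new list
    return out


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
out = [0.0] * len(a)  # allocated once, reusable across calls
rowwise_dot(a, b, out)
```

A vectorized version would typically build an elementwise product per row before summing it; the explicit loops skip that intermediate.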