Performance difference when accessing square matrix rows-first or cols-first

stevengj · April 13, 2021, 2:42pm

My explanation (which is the same as the one in the Julia manual) is perfectly consistent with this – accessing by column is faster because it corresponds to contiguous memory access, which is faster, and this difference becomes more important as the matrix gets larger and memory bandwidth dominates the runtime.

The timing comparison is even clearer if you pre-allocate the matrix, so that you aren’t timing the allocation itself. When you do that you’ll see that accessing by column is much faster even for relatively small matrices (until it gets so tiny that memory bandwidth ceases to be an issue). For example, I get:

m = 1
  7.042 ns (0 allocations: 0 bytes)
  7.023 ns (0 allocations: 0 bytes)
m = 10
  56.272 ns (0 allocations: 0 bytes)
  53.523 ns (0 allocations: 0 bytes)
m = 20
  187.415 ns (0 allocations: 0 bytes)
  69.617 ns (0 allocations: 0 bytes)
m = 50
  837.184 ns (0 allocations: 0 bytes)
  289.270 ns (0 allocations: 0 bytes)
m = 100
  8.792 μs (0 allocations: 0 bytes)
  1.211 μs (0 allocations: 0 bytes)
m = 1000
  1.443 ms (0 allocations: 0 bytes)
  256.103 μs (0 allocations: 0 bytes)
m = 10000
  931.414 ms (0 allocations: 0 bytes)
  62.150 ms (0 allocations: 0 bytes)

from

function matrix_0_1_rows(M)
    m, n = size(M)
    for i in 1:m
        for j in 1:n
            M[i,j] = 1
        end
    end
end

function matrix_0_1_cols(M)
    m, n = size(M)
    for j in 1:n
        for i in 1:m
            M[i,j] = 1
        end
    end
end

for m in [1, 10, 20, 50, 100, 1000, 10000]
    @show m
    M = zeros(m,m)
    @btime matrix_0_1_rows($M)
    @btime matrix_0_1_cols($M)
end

Topic		Replies	Views
The columns and the rows - puzzled about performance Performance question	8	905	October 14, 2021
Why column major? General Usage question , array , linearalgebra , column-major	59	19272	February 17, 2024
Row and column major order for arrays of different shape Performance column-major , row-major	23	2712	April 12, 2022
Faster sum loop when looping through vector multiple times than once Performance	5	291	September 29, 2023
Why substituting "for loop" with direct assignment of Matrix worsen the performance time? General Usage question	20	864	September 23, 2021

Performance difference when accessing square matrix rows-first or cols-first

Related topics