For your solution: I don’t know where transform comes from, but it seems that this is enough:
reshaped=transpose(reshape(a, 4, 3))
which I first took to be a column-major matrix again, but which is indeed a row-major matrix.
If you just need to go row-wise through your matrix, you may try eachrow
for its performance. E.g.:
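A minimal sketch of that suggestion, not the original poster's example: the 12-element vector `a` and the name `row_sums` are made up here for illustration. It reshapes the vector column by column, transposes the result so the original order runs along the rows, and walks the rows with `eachrow`.

```julia
# Hypothetical input: a plain vector with 12 elements.
a = collect(1:12)

# reshape fills column by column; transpose then turns those columns into rows.
reshaped = transpose(reshape(a, 4, 3))   # 3×4, no data is copied

# eachrow iterates over views of the rows, so no row is copied either.
row_sums = [sum(row) for row in eachrow(reshaped)]

println(row_sums)   # prints [10, 26, 42]
```

Each row here is one contiguous run of four elements of `a`, which is why iterating row-wise over the transposed matrix is the cheap direction.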
The result of transpose is a row-major matrix, not column-major, meaning that it is faster to iterate along the rows than along the columns. transpose creates a wrapper that basically switches the order of the indices.
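To make the wrapper behaviour concrete, here is a small sketch. The 2×3 matrix and the variable names are assumptions chosen for illustration, not taken from the thread.

```julia
using LinearAlgebra  # exports the Transpose wrapper type

a = [1 2 3; 4 5 6]        # hypothetical 2×3 matrix, made up for this sketch
ta = transpose(a)         # 3×2 wrapper; no data is copied

is_wrapper = ta isa Transpose      # true: ta is a lazy wrapper, not a new Matrix
same_data  = parent(ta) === a      # true: it wraps the very same array
swapped    = ta[3, 1] == a[1, 3]   # true: the indices are simply switched

ta[2, 1] = 20             # writes through to a[1, 2]
println(a[1, 2])          # prints 20
```

Because the memory layout of `a` is unchanged, the rows of `ta` are the contiguous columns of `a`.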
You are right, I didn’t know that.
And as seeing is believing, I had to check it out:
using BenchmarkTools

a = rand(Float64, 10000, 1000)
ta = transpose(a)  # lazy wrapper around a, no copy

# Column index in the outer loop, row index in the inner loop
function sum_over_col(a)
    s = 0.0
    for c in axes(a, 2)
        for r in axes(a, 1)
            s += a[r, c]
        end
    end
    s
end

# Row index in the outer loop, column index in the inner loop
function sum_over_row(a)
    s = 0.0
    for r in axes(a, 1)
        for c in axes(a, 2)
            s += a[r, c]
        end
    end
    s
end

@benchmark sum_over_col($a)
@benchmark sum_over_row($a)
@benchmark sum_over_col($ta)
@benchmark sum_over_row($ta)
Yields:
julia> @benchmark sum_over_col($a)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 9.441 ms (0.00% GC)
median time: 9.781 ms (0.00% GC)
mean time: 9.887 ms (0.00% GC)
maximum time: 14.099 ms (0.00% GC)
--------------
samples: 506
evals/sample: 1
julia> @benchmark sum_over_row($a)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 33.806 ms (0.00% GC)
median time: 34.650 ms (0.00% GC)
mean time: 35.018 ms (0.00% GC)
maximum time: 40.732 ms (0.00% GC)
--------------
samples: 143
evals/sample: 1
julia> @benchmark sum_over_col($ta)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 33.866 ms (0.00% GC)
median time: 34.664 ms (0.00% GC)
mean time: 35.040 ms (0.00% GC)
maximum time: 41.581 ms (0.00% GC)
--------------
samples: 143
evals/sample: 1
julia> @benchmark sum_over_row($ta)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 9.403 ms (0.00% GC)
median time: 9.810 ms (0.00% GC)
mean time: 9.961 ms (0.00% GC)
maximum time: 11.895 ms (0.00% GC)
--------------
samples: 502
evals/sample: 1
Nice. Is there some reference on the hows and whys?
Not sure about the hows, but the whys are that you want fast calculations involving transposes. You can create a transposed matrix lazily without allocating any new array, and multiplications involving transposed matrices often involve iteration along rows, so it’s important that this is fast.
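As a rough illustration of that point (the sizes and variable names below are assumptions, not from the thread), a product with a lazily transposed matrix can be compared against one that materializes the transpose first, and against a hand-written version that walks the rows of the transpose:

```julia
using LinearAlgebra

A = rand(1000, 500)   # hypothetical sizes, chosen only for this sketch
x = rand(1000)

# Lazy: the wrapper is multiplied directly, no transposed copy of A is built.
y_lazy = transpose(A) * x

# Eager: permutedims allocates a full 500×1000 copy before multiplying.
y_copy = permutedims(A) * x

# By hand: row j of transpose(A) is column j of A, which is contiguous in memory.
y_manual = [dot(view(A, :, j), x) for j in axes(A, 2)]

println(y_lazy ≈ y_copy && y_lazy ≈ y_manual)   # prints true
```

All three give the same vector; the lazy and hand-written versions only ever read `A` along its columns, which is the fast direction seen in the benchmark above.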