I know that arrays are column based in julia but maybe I’m missing something.
Want to subtract the column minimum from each column in a matrix and then the same with rows.

Currently doing this:

# subtract col minimum
for c=1:ncols
@views mat[:,c] .-= minimum(mat[:,c])
end
# subtract row minimum
row_minima = findmin(mat, dims=2)[1]
for r=1:nrows
@views mat[r,:] .-= row_minima[r]
end

which takes about 6s for a 16000*16000 UInt16 matrix.
The first part (col minimum) takes about 0.3s so is way faster than the second one.

First I used:

for r=1:nrows
@views mat[r,:] .-= minimum(mat[r,:])
end

which takes 10s so mine is at least faster but maybe there is something I’m missing out. Thanks in advance.

julia> function foocol!(mat)
for col in eachcol(mat)
col .-= minimum(col)
end
return mat
end;
foocol(mat) = foocol!(copy(mat));
julia> function foorow!(mat)
rmins = vec(minimum(mat, dims=2))
for col in eachcol(mat)
col .-= rmins
end
return mat
end;
foorow(mat) = foorow!(copy(mat));
julia> mat = rand(UInt16, 16000, 16000);
julia> @time foocol(mat);
@time foorow(mat);
0.477268 seconds (16.01 k allocations: 489.014 MiB, 1.75% gc time)
0.395470 seconds (16.03 k allocations: 489.045 MiB, 10.38% gc time)

Not 100% sure it does what you want but might be close enough. Basically you just want to traverse the array along the columns in both functions.

Edit:
Didn’t see/parse the answer before mine but it’s superior. Fraction of memory and time required (probably due to allocating views).
Hope that learning about eachcol is still useful but don’t use it for your problem.

If you want it be super fast, write a single nested-loop for both operations like this:

function subMins!(mat)
mcols = minimum(mat, dims=1)
mrows = minimum(mat, dims=2)
for j = 1:size(mat,2), i = 1:size(mat,1)
mat[i,j] -= mrows[i] + mcols[j]
end
end
using BenchmarkTools
mat = rand(UInt16, 16000, 16000)
@btime subMins!($mat)
290.291 ms (32 allocations: 63.41 KiB)

You can also do that with broadcasting. However, it is not equivalent to the original code, which computed the row minimums after subtracting the column minimums.

Yes, the above is not equivalent to the original code, the following should. But broadcasting is still not as fast though!

function subMins!(mat)
mcols = minimum(mat, dims=1)
for j = 1:size(mat,2), i = 1:size(mat,1)
mat[i,j] -= mcols[j]
end
mrows = minimum(mat, dims=2)
for j = 1:size(mat,2), i = 1:size(mat,1)
mat[i,j] -= mrows[i]
end
end
using BenchmarkTools
mat = rand(UInt16, 16000, 16000)
@btime subMins!($mat)
@btime begin
$mat .-= minimum($mat, dims=1)
$mat .-= minimum($mat, dims=2)
end
343.362 ms (32 allocations: 63.41 KiB) # loops
474.813 ms (32 allocations: 63.41 KiB) # broadcasting