Subtracting row minima

I know that arrays are column based in julia but maybe I’m missing something.
Want to subtract the column minimum from each column in a matrix and then the same with rows.

Currently doing this:

# subtract col minimum
for c=1:ncols
    @views mat[:,c] .-= minimum(mat[:,c])
end
    
# subtract row minimum
row_minima = findmin(mat, dims=2)[1]
for r=1:nrows
    @views mat[r,:] .-= row_minima[r]
 end

which takes about 6s for a 16000*16000 UInt16 matrix.
The first part (col minimum) takes about 0.3s so is way faster than the second one.

First I used:

for r=1:nrows
       @views mat[r,:] .-= minimum(mat[r,:])
 end

which takes 10s so mine is at least faster but maybe there is something I’m missing out. Thanks in advance.

You can also just write mat .-= minimum(mat, dims=1) etc. I haven’t timed it but this should be pretty efficient.

2 Likes

How about:

julia> function foocol!(mat)
           for col in eachcol(mat)
               col .-= minimum(col)
           end
           return mat
       end;
       foocol(mat) = foocol!(copy(mat));

julia> function foorow!(mat)
           rmins = vec(minimum(mat, dims=2))
           for col in eachcol(mat)
               col .-= rmins
           end
           return mat
       end;
       foorow(mat) = foorow!(copy(mat));

julia> mat = rand(UInt16, 16000, 16000);
julia> @time foocol(mat);
       @time foorow(mat);
  0.477268 seconds (16.01 k allocations: 489.014 MiB, 1.75% gc time)
  0.395470 seconds (16.03 k allocations: 489.045 MiB, 10.38% gc time)

Not 100% sure it does what you want but might be close enough. Basically you just want to traverse the array along the columns in both functions.

Edit:
Didn’t see/parse the answer before mine but it’s superior. Fraction of memory and time required (probably due to allocating views).
Hope that learning about eachcol is still useful but don’t use it for your problem.

1 Like

You mean dims=2, right? For the row minimum

Oh loving it. Thank you very much! Didn’t know about eachcol. Pretty neat. Maybe I can use it somewhere else in the code base.

A .-= minimum(A, dims=1) subtracts the minimum from each column, and A .-= minimum(A, dims=2) subtracts the minimum from each row.

I’m guessing that using broadcasting like this will be faster than using eachrow and eachcol iterators. It’s certainly easier and more compact.

4 Likes

Ah yeah it seems to be neither faster nor slower but yes easier to read. Thx!

If you want it be super fast, write a single nested-loop for both operations like this:

function subMins!(mat)
    mcols = minimum(mat, dims=1)
    mrows = minimum(mat, dims=2)
    for j = 1:size(mat,2), i = 1:size(mat,1)
        mat[i,j] -= mrows[i] + mcols[j] 
    end  
end 

using BenchmarkTools
mat = rand(UInt16, 16000, 16000)
@btime subMins!($mat)
  290.291 ms (32 allocations: 63.41 KiB)

You can also do that with broadcasting. However, it is not equivalent to the original code, which computed the row minimums after subtracting the column minimums.

Yes, the above is not equivalent to the original code, the following should. But broadcasting is still not as fast though!

function subMins!(mat)
    mcols = minimum(mat, dims=1)
    for j = 1:size(mat,2), i = 1:size(mat,1)
        mat[i,j] -= mcols[j] 
    end  
    mrows = minimum(mat, dims=2)
    for j = 1:size(mat,2), i = 1:size(mat,1)
        mat[i,j] -= mrows[i]
    end 
end 

using BenchmarkTools
mat = rand(UInt16, 16000, 16000)
@btime subMins!($mat)

@btime begin 
    $mat .-= minimum($mat, dims=1)
    $mat .-= minimum($mat, dims=2)
end

  343.362 ms (32 allocations: 63.41 KiB)  # loops
  474.813 ms (32 allocations: 63.41 KiB)  # broadcasting
1 Like